1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.

Development Any SQL whizzkids out there?

Discussion in 'Tech Support' started by Sketchee, 25 Nov 2015.

  1. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    So I'm struggling with what I hoped would be a fairly straightforward select query and it's causing me a massive headache... anyone feel like they could be of a little assistance to me? :)
     
  2. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    I'll give it a shot, as long as its not your homework.
     
  3. deathtaker27

    deathtaker27 Modder

    Joined:
    17 Apr 2010
    Posts:
    2,238
    Likes Received:
    186
    feel free to send it my way as well :)
     
  4. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    Cheers, I'll post it up here then.

    Using postgis to do this. I've got 3 tables of data: person (p), postcode (po) and roads (r). postcode and roads are point and line features from shapefiles so contain spatial information. person also contains postcode data so I can use a join on that. The following is my query which returns the information I want but I need to have it so that it orders by dist_m in ascending order.

    Code:
    SELECT DISTINCT ON (p.caseid) p.caseid, p.surname, p.postcode, r.dftnumber AS closest_a_road, ROUND((ST_Distance(r.geom, po.geom))::NUMERIC, 1) AS distance_m
    FROM env.roads r, env.postcode po
    INNER JOIN env.person p
    ON po.postcode = p.postcode
    WHERE r.classifica = 'A Road'
    ORDER BY p.caseid, ST_Distance(r.geom, po.geom);
    This is linking each record in the person table to the nearest A road from the roads table. If I don't use DISTINCT on the ID for each person it will return a record for each person to every A road, whereas I only want the closest A road.. so the shortest distance in the column distance_m. However using DISTINCT ON p.caseid means I can't order by the distance... any suggestions?
     
  5. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    Its a little hard to tell without looking at the full schema and running some test queries, but would doing something like this get you the details you need?

    Code:
    SELECTp.caseid, MIN(ROUND((ST_Distance(r.geom, po.geom))::NUMERIC, 1)) FROM
    FROM env.roads r, env.postcode po
    INNER JOIN env.person p
    ON po.postcode = p.postcodep
    GROUP BY p.caseid
     
  6. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    It's an awkward one, it should only return 11 results from the current data I have which it does from the original query. It's just the ordering by distance that doesn't play along and I can't seem to get the MIN aggregater to work. I could provide a backup of the pgadmin database plus query if you fancied a proper look at the data?

    Cheers for the response
     
  7. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    What about using the column index to order the query. Order by 1, 5
     
  8. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    That would probably surpass the limited level of knowledge I have of using SQL I'm afraid..
     
  9. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    Just replace your order by statement "ORDER BY p.caseid, ST_Distance(r.geom, po.geom)" with "order by 1, 5". which means order by the 1st column then the 5th. You can do this instead of using the column names.
     
  10. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    Unfortunately that still orders the query by p.caseid first. I'm hoping once I've got the results (postcode for each caseid (11 results) and distance to closest A road only) to order the query only by the distance_m column (column 5).
     
  11. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    Sorry, I thought that the query was failing to execute completely because of the order by statement. I think it may not be possible to do what you want with the query structured in its current form. If you look at the stack overflow post below you can see someone with the same problem as you.

    http://stackoverflow.com/questions/9795660/postgresql-distinct-on-with-different-order-by

    The "distinct on" operator requires that you order by what ever is used with "distinct on" (actually the left most term of what is used with distinct on), so if you use "distinct on" you will have to order by p.caseid, which is obviously not what you want.

    You are going to have to bust out the sub queries to get this one done I think.

    Is there any relationship between the roads table and the postcode table, do they have any columns in common?
     
  12. deathtaker27

    deathtaker27 Modder

    Joined:
    17 Apr 2010
    Posts:
    2,238
    Likes Received:
    186
    I have almost 0 experiance with postgress sql but in MSSQL you can use partition by to achive your requirements, it may be that postgress has this functionality as well?
     
    Sketchee likes this.
  13. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    No there's no relationship between postcodes and road, they're pretty much just the tables from a shapefile .dbf, they both have geometry which is where the spatial part of the query comes in. That's an interesting post though, I'll have a crack at a sub query when I get back and see how that goes. Appreciate the help, I'm determined to solve it! :lol:
     
  14. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    I have no idea really, I'm pretty new to sql in general. Normally use arcgis, but I'm using qgis for this in particular so it's all part of the package. Don't think I've come across partitioning before?
     
  15. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    I think there's a cross join happening in the ST_DISTANCE function. It might not be ideal.
     
    Sketchee likes this.
  16. Sketchee

    Sketchee Suddenly, looters! Hundreds of 'em!

    Joined:
    23 Oct 2010
    Posts:
    544
    Likes Received:
    40
    I think I've solved it. Albeit there's probably a more elegant solution but the following seems to give me the right answer.

    Code:
    SELECT * FROM (
    SELECT DISTINCT ON (p.caseid) p.caseid, p.postcode, r.dftnumber AS "closestARoad", ROUND((ST_Distance(r.geom, po.geom))::NUMERIC, 1) AS distance_m
    FROM env.roads r, env.postcode po
    INNER JOIN env.person p
    ON po.postcode = p.postcode
    WHERE r.classifica = 'A Road'
    ORDER BY p.caseid, ST_Distance(r.geom, po.geom)) AS t
    ORDER BY t.distance_m;
    I've got a few more to do though so I might pop back in here in the near future to pick your brains again :D Really appreciate the help!
     
  17. theshadow2001

    theshadow2001 [DELETE] means [DELETE]

    Joined:
    3 May 2012
    Posts:
    5,284
    Likes Received:
    183
    Well it aint pretty and its not exactly the sub-query I had in mind, but you can't argue with results. :thumb:

    I think your table structures might not be the best, which could be part of the reason you are having trouble with the queries. For example (and without knowing the full context of the tables and what you are doing) surely each geometric point on a road would have an associated postcode. There's a whole process that goes into creating a set of tables that works well in a relational database, its more than just grouping like things together.

    If you run the ST_Distance query on its own

    Code:
    SELECT (ROUND((ST_Distance(r.geom, po.geom))::NUMERIC, 1) AS distance_m
    FROM env.roads r, env.postcode po
    WHERE r.classifica = 'A Road';
    Are you getting more results back than you expect?
     

Share This Page