Documenting behavior for aggregations that can't be calculated #96

smmaurer · 2018-07-16T19:24:55Z

It looks like pandana.Network.aggregate() returns values of -1 for source nodes where an aggregation can't be calculated, for example if there aren't any other nodes within the distance radius. I can't find a reference to this in the documentation, though. We should confirm what the behavior is and make it more explicit.

Docstrings for pandana.Network.aggregate(): https://github.com/UDST/pandana/blob/master/pandana/network.py#L274-L320

Sphinx documentation: http://udst.github.io/pandana/network.html#pandana.network.Network.aggregate

Several conditions in the C++ code produce values of -1, but I haven't traced out the details: https://github.com/UDST/pandana/blob/master/src/accessibility.cpp
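Until the behavior is confirmed and documented, one way to guard against the -1 values downstream is to mask them before computing summary statistics. This is only a sketch. It assumes the result is a pandas Series indexed by node id, that -1 really does mean "could not be calculated", and that -1 is never a legitimate aggregate for the variable in question. The `result` Series here is made-up stand-in data, not actual pandana output.

```python
import numpy as np
import pandas as pd

# Stand-in for an aggregation result indexed by node id (made-up values).
result = pd.Series([120.0, -1.0, 35.5, -1.0], index=[101, 102, 103, 104])

# Assumption: -1 marks nodes where the aggregation could not be calculated.
# Replace those entries with NaN so they are skipped by mean(), sum(), etc.
cleaned = result.where(result != -1, np.nan)

n_missing = int(cleaned.isna().sum())
print(n_missing, cleaned.mean())
```

Without the mask, the -1 entries would silently pull down means and sums.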

smmaurer · 2018-07-30T20:06:48Z

Related to this are the messages about dropped rows that sometimes show up when you run an aggregation calculation:

Computing pop_500_walk
Removed 189769 rows because they contain missing values

These messages are generated by the pandana.Network.set() call that links the values being aggregated to the network.

https://github.com/UDST/pandana/blob/master/pandana/network.py#L235

Here's what happens, for the example of aggregating a variable from the households table:

- If filters are provided, these rows are removed from the households table first.
- Then, pandana tries to link each remaining row to a network node.
- Any rows that are either missing a node id or have a NaN in the column being aggregated are dropped.
- The message refers to rows dropped from the households table, not nodes dropped from the network.

Often, the rows are dropped because they can't be matched to nodes (for example households that are not assigned to buildings and thus don't have a spatial location), not because of missing values in the data column.

Rows that are explicitly filtered out aren't counted in the message, so the number of dropped rows can differ between aggregations on the same table when their filters differ.
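The steps above can be mimicked in plain pandas to see why the counts vary. This is a rough illustration of the logic as described in this comment, not pandana's actual code. The `households` table, its column names, and the `count_dropped` helper are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical households table; names and values are illustrative only.
households = pd.DataFrame({
    "node_id": [10, 10, np.nan, 11, 12, np.nan],
    "persons": [2, np.nan, 3, 1, 4, 5],
    "income": [50, 60, 70, 80, 90, 20],
})

def count_dropped(df, column, query=None):
    """Count rows that would be reported as removed for missing values."""
    # Step 1: apply the filter first; rows removed here are not counted.
    if query is not None:
        df = df.query(query)
    # Steps 2-3: keep only rows with both a node id and a value to aggregate.
    linked = df[["node_id", column]].dropna(how="any")
    return len(df) - len(linked)

unfiltered = count_dropped(households, "persons")
filtered = count_dropped(households, "persons", "income >= 50")
print(unfiltered, filtered)
```

In this toy table, two of the three rows dropped in the unfiltered case have no node id rather than a missing value, and the filter removes one of those unlocated rows before the count is taken, so the filtered run reports fewer dropped rows.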

Here is a notebook where we dug into this: More-aggregation-troubleshooting.ipynb
