added country dictionary
benjbaron committed Oct 1, 2018
1 parent ad6ea71 commit b9508e2
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion GeoNames Cities Pipeline.ipynb
@@ -105,7 +105,7 @@
 " StructField(\"AC2\", StringType()),\n",
 " StructField(\"AC3\", StringType()),\n",
 " StructField(\"AC4\", StringType()),\n",
-" StructField(\"POP\", StringType()),\n",
+" StructField(\"POP\", IntegerType()),\n",
 " StructField(\"ALT\", StringType()),\n",
 " StructField(\"DEM\", StringType()),\n",
 " StructField(\"TZ\", StringType()),\n",
@@ -307,6 +307,9 @@
 "# User-defined function to transform an array of strings into a string (eq. str.join).\n",
 "array_to_string_udf = udf(lambda x: \",\".join(x))\n",
 "\n",
+"# Get the countries present in the Postal Codes dataframe\n",
+"countries = df_pc_filtered.groupby('CC').count().toPandas().set_index('CC').T.to_dict()\n",
+"\n",
 "print(\"Computing the join...\")\n",
 "for cc, count in countries.items(): \n",
 " print(\"[.] %s: Processing...\" % cc)\n",
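Note on the shape of `countries`: the chain `.set_index('CC').T.to_dict()` appears to produce a nested dictionary, one inner dict per country code, so `count` in the loop would be a dict such as `{'count': 2}` rather than a bare integer. The sketch below reproduces this with pandas only. The small `counts` frame is a hypothetical stand-in for the result of `df_pc_filtered.groupby('CC').count().toPandas()`, assumed to have the columns `CC` and `count`, and `flat_counts` is an illustrative name, not part of the commit.

```python
import pandas as pd

# Hypothetical stand-in for df_pc_filtered.groupby('CC').count().toPandas():
# one row per country code, with the row count in a 'count' column.
counts = pd.DataFrame({"CC": ["FR", "GB"], "count": [2, 1]})

# Same pandas chain as in the commit. After the transpose, the country codes
# become the columns, so to_dict() maps each code to an inner dict keyed by
# the former column name: 'FR' -> {'count': 2}.
countries = counts.set_index("CC").T.to_dict()

# Flatter alternative (illustrative): map each code directly to its count.
flat_counts = counts.set_index("CC")["count"].to_dict()

for cc, count in countries.items():
    # Here `count` is the inner dict; the number itself is count['count'].
    print("[.] %s: %d rows" % (cc, count["count"]))
```

If the loop only needs the country codes, either form works; the flat mapping is mainly useful when the count itself is read inside the loop.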
