I'm doing some number crunching on a raster with arcpy and want to use the multiprocessing package to speed things up. Basically, I need to loop through a list of tuples, do some raster calculations using each tuple, and write some outputs to files. My inputs consist of a data raster (bathymetry), a raster that defines zones, and a tuple of two floats (water surface elevation, depth). My procedure consists of a function computeplane, which takes a tuple and runs a series of raster calculations to produce five rasters (total, littoral, surface, subsurface, profundal), and then calls the function processtable for each of those rasters to write values to a dbf using arcpy.sa.ZonalStatisticsAsTable, add some fields using arcpy.AddField_management, convert the dbf to a csv using arcpy.TableToTable_conversion, and finally delete the dbf file using arcpy.Delete_management.

Based on some other posts, I've wrapped my code in a main() so that multiprocessing.Pool will supposedly play nice. I use main() to create the set of tuples and pool.map to do the multiprocessing. I use the tempfile package to pick names for the dbf file to avoid name conflicts; the csv file names are guaranteed not to conflict with those of the other worker processes.
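A minimal sketch of the temporary-name idea using only public standard-library calls, rather than the private `tempfile._get_candidate_names()` helper used in the script. The helper name `unique_dbf_path`, the letter prefix, and the 10-character random suffix are illustrative assumptions, not part of the original code.

```python
import os
import uuid


def unique_dbf_path(folder):
    """Return a .dbf path in `folder` that is very unlikely to collide across processes."""
    # Hypothetical naming scheme: process id plus random hex, prefixed with a
    # letter as a precaution so the table name does not start with a digit.
    name = "t%d_%s.dbf" % (os.getpid(), uuid.uuid4().hex[:10])
    return os.path.join(folder, name)
```

Each call returns a fresh path, so every worker process can pass its own value as the output table without coordinating with the others.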

I have tested my code with a for loop and it works fine, but when I try to use pool.map I get

RuntimeError: ERROR 010240: Could not save raster dataset to C:\Users\michael\AppData\Local\ESRI\Desktop10.4\SpatialAnalyst\lesst_ras with output format GRID.

What is happening here? This error does not show up in the non-multiprocess version of the code, and I don't write out the rasters anywhere. Then again, I don't know how arcpy deals with the intermediate rasters (I sure don't think it keeps them in memory though, since they are too large). Do I need to tell arcpy something about the way it handles the raster calculations for multiprocessing to work? I've included my Python file below.

    import arcpy
    arcpy.CheckOutExtension("Spatial")
    import arcpy.sa
    import numpy
    import multiprocessing
    import tempfile

    bathymetry_path = r'c:/gis workspace/rre/habitat.gdb/bathymetry_ngvd_meters'
    zones_path = r'c:/gis workspace/rre/habitat.gdb/markerzones_meters'
    table_folder = r'c:/gis workspace/rre/zonetables'
    bathymetry = arcpy.sa.Raster(bathymetry_path)
    zones = arcpy.sa.Raster(zones_path)

    def processtable(raster_obj, zone_file, w, z, out_folder, out_file):
        # write zonal sums to a temporary dbf, tag them with wse and z, export to csv
        temp_name = "/" + next(tempfile._get_candidate_names()) + ".dbf"
        arcpy.sa.ZonalStatisticsAsTable(zone_file, 'VALUE', raster_obj, out_folder + temp_name, "DATA", "SUM")
        arcpy.AddField_management(out_folder + temp_name, "wse", 'TEXT')
        arcpy.AddField_management(out_folder + temp_name, "z", 'TEXT')
        arcpy.CalculateField_management(out_folder + temp_name, "wse", "'" + str(w) + "'", "PYTHON")
        arcpy.CalculateField_management(out_folder + temp_name, "z", "'" + str(z) + "'", "PYTHON")
        arcpy.TableToTable_conversion(out_folder + temp_name, out_folder, out_file)
        arcpy.Delete_management(out_folder + temp_name)

    def computeplane(wsedepth):
        wse = wsedepth[0]
        depth = wsedepth[1]
        # raster algebra: each line produces an intermediate raster
        total = bathymetry < depth
        littoral = total & ((wse - bathymetry) < 2)
        surface = total & ~(littoral) & ((total + wse - depth) < (total + 2))
        profundal = total & ((total + wse - depth) > (total + 5))
        subsurface = total & ~(profundal | surface | littoral)
        # zonal statistics table names
        total_name = 'total_w' + str(wse) + '_z' + str(depth) + '.csv'
        littoral_name = 'littoral_w' + str(wse) + '_z' + str(depth) + '.csv'
        surface_name = 'surface_w' + str(wse) + '_z' + str(depth) + '.csv'
        subsurface_name = 'subsurface_w' + str(wse) + '_z' + str(depth) + '.csv'
        profundal_name = 'profundal_w' + str(wse) + '_z' + str(depth) + '.csv'
        # compute zonal statistics
        processtable(total, zones, wse, depth, table_folder, total_name)
        processtable(littoral, zones, wse, depth, table_folder, littoral_name)
        processtable(surface, zones, wse, depth, table_folder, surface_name)
        processtable(profundal, zones, wse, depth, table_folder, profundal_name)
        processtable(subsurface, zones, wse, depth, table_folder, subsurface_name)

    def main():
        watersurface = numpy.arange(-15.8, 2.7, 0.1)
        # take a small subset of the tuples: watersurface[33:34]
        wsedepths = [(watersurface[x], watersurface[y])
                     for x in range(watersurface.size)[33:34]
                     for y in range(watersurface[0:x+1].size)]
        pool = multiprocessing.Pool()
        pool.map(computeplane, wsedepths)
        pool.close()
        pool.join()

    if __name__ == '__main__':
        main()

Update

Some more sleuthing reveals that this isn't so much a multiprocessing problem as an issue with the way ArcGIS does raster processing. Raster algebra results are written to files in the default workspace; in my case I had not specified a folder, so arcpy was writing the rasters to an AppData folder. ArcGIS uses basic names like Lessth, Lessth_1, etc. based on what your algebra expression is. Since I didn't specify a workspace, all the multiprocessing workers were writing to this folder. While a single arcpy process can keep track of the names, the multiple processes all try to write the same raster names and bump into the other processes' locks.
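The idea of keeping each worker's intermediate rasters apart can be sketched as below. This is only an illustration under assumptions: `make_scratch_dir` and `init_worker` are hypothetical helper names, plain folders stand in for whatever workspace type is appropriate, and nothing in this post confirms that a per-process workspace resolves the errors described here.

```python
import os
import tempfile


def make_scratch_dir(base_dir):
    """Create and return a folder that only the calling process will use."""
    return tempfile.mkdtemp(prefix="scratch_%d_" % os.getpid(), dir=base_dir)


def init_worker(base_dir):
    """Pool initializer (hypothetical): point arcpy at a per-process scratch folder."""
    scratch = make_scratch_dir(base_dir)
    try:
        import arcpy  # only importable inside an ArcGIS Python installation
    except ImportError:
        return scratch  # lets the sketch run on a machine without ArcGIS
    # With these set, intermediate map-algebra rasters should be written here
    # instead of the shared default folder (an assumption, not verified above).
    arcpy.env.workspace = scratch
    arcpy.env.scratchWorkspace = scratch
    return scratch
```

A pool would pick this up via `multiprocessing.Pool(initializer=init_worker, initargs=(base_dir,))`. Removing the folders only after `pool.join()`, once the workers have exited, is a further assumption about when arcpy lets go of its locks.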

I tried creating a random workspace (file gdb) at the beginning of computeplane and then deleting it at the end, but arcpy doesn't release its lock in a timely manner and the process crashes on the delete statement. So I'm not sure how to proceed.

Well, the solution to ERROR 010240 was to use the arcpy.gp.RasterCalculator_sa function instead of arcpy.sa to write rasters with specified output names. Unfortunately, after re-implementation I ran into

FATAL ERROR (INFADI) MISSING DIRECTORY

which is described in another StackExchange post. The recommendations in that post are to specify a different workspace in each function call, which was the first thing I tried (without success). There's also some discussion that only one raster can be written out at a time, so there's no point in multiprocessing raster calculations anyway. I give up!
