Python - arcpy + multiprocessing: error: could not save raster dataset
I'm doing some number crunching on rasters with arcpy and want to use the multiprocessing package to speed things up. Basically, I need to loop through a list of tuples, do some raster calculations using each tuple, and write some outputs to files. My inputs consist of a data raster (bathymetry), a raster that defines zones, and a tuple of two floats (water surface elevation, depth). My procedure consists of a function computeplane, which takes a tuple and runs a series of raster calculations to produce five rasters (total, littoral, surface, subsurface, profundal), and then calls the function processtable for each of those rasters to write values to a dbf using arcpy.sa.ZonalStatisticsAsTable, add some fields using arcpy.AddField_management, convert the dbf to a csv using arcpy.TableToTable_conversion, and finally delete the dbf file using arcpy.Delete_management.
Based on some other posts, I've wrapped my code in main() so that multiprocessing.Pool will supposedly play nice. I use main() to create the set of tuples and pool.map to do the multiprocessing. I use the tempfile package to pick names for the dbf file to avoid name conflicts; the csv file names are guaranteed not to conflict with those of the other worker processes.
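The script picks its dbf names with tempfile._get_candidate_names(), which is a private helper of the tempfile module. A minimal sketch of an alternative, assuming any collision-resistant name is acceptable for the intermediate dbf; the helper unique_dbf_path is hypothetical and not part of the original script:

```python
import os
import uuid


def unique_dbf_path(folder):
    """Return a dbf path in `folder` that is unlikely to clash across processes.

    Hypothetical helper, not part of the original script.
    """
    # The process ID separates workers; the random UUID fragment separates
    # successive calls made by the same worker.
    name = "t{0}_{1}.dbf".format(os.getpid(), uuid.uuid4().hex[:8])
    return os.path.join(folder, name)
```

Inside processtable, the returned path could stand in for out_folder + temp_name.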
I have tested the code with a for loop and it works fine, but when I try to use pool.map I get
RuntimeError: ERROR 010240: Could not save raster dataset to c:\users\michael\appdata\local\esri\desktop10.4\spatialanalyst\lesst_ras with output format GRID.
What is happening here? This error does not show up in the non-multiprocess version of the code, and I don't write out the rasters anywhere. Then again, I don't know how arcpy deals with the intermediate rasters (I sure don't think it keeps them in memory, though; they are too large). Do I need to tell arcpy something about the way it handles the raster calculations for multiprocessing to work? I've included my Python file below.
import arcpy
arcpy.CheckOutExtension("Spatial")
import arcpy.sa
import numpy
import multiprocessing
import tempfile

bathymetry_path = r'c:/gis workspace/rre/habitat.gdb/bathymetry_ngvd_meters'
zones_path = r'c:/gis workspace/rre/habitat.gdb/markerzones_meters'
table_folder = r'c:/gis workspace/rre/zonetables'
bathymetry = arcpy.sa.Raster(bathymetry_path)
zones = arcpy.sa.Raster(zones_path)

def processtable(raster_obj, zone_file, w, z, out_folder, out_file):
    # write zonal sums to a temporary dbf, tag the rows with wse and z,
    # convert the dbf to csv, then delete the dbf
    temp_name = "/" + next(tempfile._get_candidate_names()) + ".dbf"
    arcpy.sa.ZonalStatisticsAsTable(zone_file, 'VALUE', raster_obj, out_folder + temp_name, "DATA", "SUM")
    arcpy.AddField_management(out_folder + temp_name, "wse", 'TEXT')
    arcpy.AddField_management(out_folder + temp_name, "z", 'TEXT')
    arcpy.CalculateField_management(out_folder + temp_name, "wse", "'" + str(w) + "'", "PYTHON")
    arcpy.CalculateField_management(out_folder + temp_name, "z", "'" + str(z) + "'", "PYTHON")
    arcpy.TableToTable_conversion(out_folder + temp_name, out_folder, out_file)
    arcpy.Delete_management(out_folder + temp_name)

def computeplane(wsedepth):
    wse = wsedepth[0]
    depth = wsedepth[1]
    # each raster algebra expression creates an intermediate raster
    total = bathymetry < depth
    littoral = total & ((wse - bathymetry) < 2)
    surface = total & ~(littoral) & ((total + wse - depth) < (total + 2))
    profundal = total & ((total + wse - depth) > (total + 5))
    subsurface = total & ~(profundal | surface | littoral)
    # zonal statistics table names
    total_name = 'total_w' + str(wse) + '_z' + str(depth) + '.csv'
    littoral_name = 'littoral_w' + str(wse) + '_z' + str(depth) + '.csv'
    surface_name = 'surface_w' + str(wse) + '_z' + str(depth) + '.csv'
    subsurface_name = 'subsurface_w' + str(wse) + '_z' + str(depth) + '.csv'
    profundal_name = 'profundal_w' + str(wse) + '_z' + str(depth) + '.csv'
    # compute zonal statistics
    processtable(total, zones, wse, depth, table_folder, total_name)
    processtable(littoral, zones, wse, depth, table_folder, littoral_name)
    processtable(surface, zones, wse, depth, table_folder, surface_name)
    processtable(profundal, zones, wse, depth, table_folder, profundal_name)
    processtable(subsurface, zones, wse, depth, table_folder, subsurface_name)

def main():
    watersurface = numpy.arange(-15.8, 2.7, 0.1)
    # take small subset of tuples: watersurface[33:34]
    wsedepths = [(watersurface[x], watersurface[y])
                 for x in range(watersurface.size)[33:34]
                 for y in range(watersurface[0:x+1].size)]
    pool = multiprocessing.Pool()
    pool.map(computeplane, wsedepths)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
Update
Some more sleuthing reveals that this isn't so much a multiprocessing problem as an issue with the way ArcGIS does raster processing. Raster algebra results are written to files in the default workspace; in my case I had not specified a folder, so arcpy was writing the rasters to some AppData folder. ArcGIS uses basic names like lessth, lessth_1, etc. based on what your algebra expression is. Since I didn't specify a workspace, all the multiprocessing worker processes were writing to this folder. While a single arcpy process can keep track of the names, the multiple processes all try to write the same raster names and bump into the other processes' locks.
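The collision described above can be illustrated with a toy model. This is only a sketch: pick_name is a hypothetical stand-in for whatever naming logic ArcGIS uses internally, and a Python set stands in for the shared folder.

```python
def pick_name(base, existing):
    """Return the first free name in the sequence base, base_1, base_2, ..."""
    if base not in existing:
        return base
    suffix = 1
    while "{0}_{1}".format(base, suffix) in existing:
        suffix += 1
    return "{0}_{1}".format(base, suffix)


# One process: it sees its own earlier outputs, so names never repeat.
folder = set()
first = pick_name("lessth", folder)
folder.add(first)
second = pick_name("lessth", folder)
folder.add(second)

# Two processes: both inspect the folder before either has written,
# so both settle on the same name and then compete for the same file.
snapshot = set(folder)
worker_a = pick_name("lessth", snapshot)
worker_b = pick_name("lessth", snapshot)
collision = worker_a == worker_b
```

In this model the single process gets distinct names, while the two workers that read the same snapshot of the folder end up with an identical name.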
I tried creating a random workspace (file gdb) at the beginning of computeplane and deleting it at the end, but arcpy doesn't release its lock in a timely manner and the process crashes on the delete statement. I'm not sure how to proceed.
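One variation on that attempt is to create the workspace once per worker process rather than once per call, and to delete everything only after the pool has been joined. The sketch below is untested against ArcGIS. It assumes that intermediate rasters follow arcpy.env.workspace and arcpy.env.scratchWorkspace and that a plain folder is acceptable as a workspace; init_worker and run are hypothetical names.

```python
import multiprocessing
import os
import shutil
import tempfile

try:
    import arcpy  # only available inside an ArcGIS Python installation
except ImportError:
    arcpy = None


def init_worker(base_dir):
    """Pool initializer: give this worker process its own scratch folder."""
    prefix = "worker{0}_".format(os.getpid())
    work_dir = tempfile.mkdtemp(prefix=prefix, dir=base_dir)
    if arcpy is not None:
        # Assumption: intermediate rasters follow these environment settings.
        arcpy.env.workspace = work_dir
        arcpy.env.scratchWorkspace = work_dir
    return work_dir


def run(tasks, worker, base_dir):
    """Map `worker` over `tasks`, then remove every scratch folder at once.

    `base_dir` should be a folder dedicated to this run, because it is
    deleted together with the per-worker folders inside it.
    """
    pool = multiprocessing.Pool(initializer=init_worker, initargs=(base_dir,))
    try:
        pool.map(worker, tasks)
    finally:
        pool.close()
        pool.join()
        # The workers have exited by now, so nothing is deleted while a
        # worker process is still running.
        shutil.rmtree(base_dir, ignore_errors=True)
```

With the script in the question, the call would look like run(wsedepths, computeplane, some_scratch_folder) from inside main(). Deferring the delete until after pool.join() is meant to sidestep the lock that made the per-call delete crash.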
Well, the solution to ERROR 010240 was to use the arcpy.gp.RasterCalculator_sa function instead of arcpy.sa to write the rasters with specified output names. Unfortunately, after re-implementation I ran into
FATAL ERROR (INFADI) MISSING DIRECTORY
which is described in another StackExchange post. The recommendations in that post are to specify a different workspace in each function call, which was the first thing I tried (without success). There's also some discussion that only one raster can be written out at a time, so there's no point in multiprocessing the raster calculations anyway. I give up!