org.apache.thrift.transport.TTransportException error while reading a large JSON file in Zeppelin (Scala)


I am trying to read a large JSON file (1.5 GB) using Zeppelin and Scala.

Zeppelin is running on Spark in local mode, installed on Ubuntu in a VM with 10 GB of RAM. I have allotted 8 GB to spark.executor.memory.

My code is below:

val inputfileweather = "/home/shashi/incubator-zeppelin-master/data/ai/weather.json"
val temp = sqlContext.read.json(inputfileweather)

I am getting the following error:

org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:241)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:225)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:229)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:229)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:328)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The error you got is caused by a problem running the Spark interpreter: Zeppelin could not connect to the interpreter process, most likely because that process died or never started properly.

You have to check the logs located in /path/to/zeppelin/logs/*.out to find out what is happening. Perhaps you will see an OOM (out-of-memory error) in the interpreter logs.
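If the log files are long, a small script can do the searching. The following is only a sketch: findOomLines is a hypothetical helper written for this answer (it is not part of Zeppelin), and it assumes an out-of-memory failure shows up in the logs as a line containing "OutOfMemoryError", which is how the JVM normally reports it.

```scala
import java.io.File
import scala.io.Source

// Hypothetical helper (not part of Zeppelin): returns "file:line: text" for every
// line in the directory's *.out files that mentions OutOfMemoryError.
def findOomLines(logDir: File): Seq[String] = {
  // listFiles returns null when the directory is missing, hence the Option.
  val outFiles = Option(logDir.listFiles()).getOrElse(Array.empty[File])
    .filter(f => f.isFile && f.getName.endsWith(".out"))
    .toSeq
  outFiles.flatMap { file =>
    // ISO-8859-1 maps every byte to a character, so odd bytes cannot abort the scan.
    val source = Source.fromFile(file, "ISO-8859-1")
    try {
      source.getLines().zipWithIndex.collect {
        case (line, idx) if line.contains("OutOfMemoryError") =>
          s"${file.getName}:${idx + 1}: $line"
      }.toList // force evaluation before the file is closed
    } finally source.close()
  }
}

// Placeholder path from the text above; substitute your own install directory.
findOomLines(new File("/path/to/zeppelin/logs")).foreach(println)
```

An empty result does not prove memory was fine; it only means no matching line was written to the *.out files, so it is still worth reading the end of the interpreter log by hand.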

I think 8 GB of executor memory on a VM with 10 GB of RAM is unreasonable (and how many executors are you starting?). You have to consider the driver memory as well.
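To see how much memory the interpreter actually received, you can ask the JVM directly from a notebook paragraph. This is a minimal sketch, not a diagnosis: as far as I know, in local mode Spark runs the driver and the executor inside one JVM, so the heap of that single process matters more than spark.executor.memory on its own. The commented-out lines assume the sc variable that Zeppelin predefines for Spark paragraphs.

```scala
// Maximum heap the current JVM will use, in MB. In a Zeppelin Spark paragraph
// this is the interpreter (driver) process.
val maxHeapMb = Runtime.getRuntime.maxMemory / (1024L * 1024L)
println(s"Max JVM heap: $maxHeapMb MB")

// `sc` is the SparkContext that Zeppelin predefines; uncomment inside Zeppelin.
// println(sc.getConf.getOption("spark.executor.memory"))
// println(sc.getConf.getOption("spark.driver.memory"))
```

If the reported heap is far below what you configured, the setting is probably not reaching the interpreter process. Whatever value you choose, leave headroom on the 10 GB VM for the operating system and the Zeppelin server itself.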

