Warm tip: This article is reproduced from stackoverflow.com, please click
google-cloud-data-fusion

Getting Null Pointer Exception when mapping SQL Server Database to MySQL Database with MapReduce

发布于 2020-06-27 05:07:34

I am new to Cloud Data Fusion and am trying to map tables in a SQL Server Database to a MySQL Database. I have already faced many issues which I managed to solve namely:

  • Fixed permissions for the service account so it could access all the resources it required;
  • Added IP to the allowed connections in my SQL Server;
  • Am using system.profile.properties.dataproc:dataproc.conscrypt.provider.enable = false to prevent SSL bug issue as reported in another question.

After this last fix, I am now trying to deal with a NULL pointer exception on the MapReduce job at io.cdap.cdap.internal.app.runtime.ProgramControllerServiceAdapter#97-MapReduceRunner-phase-1.

The stacktrace provided by Data Fusion is as follows:

java.lang.NullPointerException: null
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:138) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:155) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:241) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:68) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:51) ~[na:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[na:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:50) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:36) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:71) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext$2.run(AbstractContext.java:551) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:224) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:211) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:546) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:534) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:69) ~[na:na]
at io.cdap.cdap.etl.batch.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:111) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.MapReducePreparer.prepare(MapReducePreparer.java:97) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce.initialize(ETLMapReduce.java:192) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:109) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:32) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:182) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:177) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$1(AbstractContext.java:640) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:600) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:637) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:547) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:226) ~[na:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) ~[com.google.guava.guava-13.0.1.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]

Any help would be deeply appreciated.

Thanks.

P.S.: After solving this issue, I am now running into this:

java.lang.ClassCastException: io.cdap.plugin.db.DBRecord cannot be cast to io.cdap.plugin.db.DBRecord
at io.cdap.plugin.db.batch.source.AbstractDBSource.transform(AbstractDBSource.java:267) ~[database-commons-1.2.0.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$transform$2(WrappedBatchSource.java:69) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:68) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:36) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeStage.consume(PipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeTransformExecutor.runOneIteration(PipeTransformExecutor.java:43) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:142) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:230) ~[cdap-etl-batch-6.0.1.jar:na]
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:135) [na:na]
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_212]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_212]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_212]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_212]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_212]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_212]
at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114) [io.cdap.cdap.cdap-app-fabric-6.0.1.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(Unknown Source) [hadoop-mapreduce-client-app-2.8.5.jar:na]

P.S. 2: After solving the aforementioned issues, I am now capable of migrating the tables. However, I am sometimes getting the following stacktrace as a warning which then forces the job to end. Before actually failing, the job repeats itself (don't know if this is default behaviour or not). Also, it seems like it either cannot write so many rows to the destination database or the connection is lost. This is holding me back from migrating specific tables. Any idea as to why?

com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:404) at com.mysql.jdbc.Util.getInstance(Util.java:387) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:917) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:896) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:885) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860) at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1246) at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1241) at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4564) at io.cdap.plugin.db.batch.sink.ETLDBOutputFormat$1.close(ETLDBOutputFormat.java:90) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670) at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2021) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114) at org.apache.hadoop.mapred.YarnChild.main(Unknown Source)

Thanks!

Questioner
Pedromlm
Viewed
20
Edwin Elia 2019-07-13 05:29

For the second error, you are encountering a classloading bug: https://issues.cask.co/browse/CDAP-15636

For a workaround to this, please try to use the generic Database source and sink instead of the product specific databases. The configurations should be mostly the same.