google-cloud-data-fusion

google cloud data fusion - Getting a NullPointerException when mapping a SQL Server database to a MySQL database using MapReduce

Posted on 2020-07-01 18:22:54

I'm new to Cloud Data Fusion and am trying to map tables from a SQL Server database to a MySQL database. I've already worked through several issues, namely:

  • Fixed the service account permissions so it can access all the resources it needs;
  • Added an IP to the allowed connections for my SQL Server;
  • I'm using system.profile.properties.dataproc:dataproc.conscrypt.provider.enable = false to prevent the SSL error reported in another question (see the snippet after this list).
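
For reference, that last property is a plain key/value pair; entered as a Data Fusion runtime argument it looks like the line below (the key is copied verbatim from the bullet above; exactly where you set it, pipeline runtime arguments versus compute profile properties, depends on your setup):

system.profile.properties.dataproc:dataproc.conscrypt.provider.enable=false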

After that last fix, I'm now dealing with a NULL pointer exception on the MapReduce job, at io.cdap.cdap.internal.app.runtime.ProgramControllerServiceAdapter#97-MapReduceRunner-phase-1.

The stack trace provided by Data Fusion is as follows:

java.lang.NullPointerException: null
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:138) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:155) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:241) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:68) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:51) ~[na:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[na:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:50) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:36) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:71) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext$2.run(AbstractContext.java:551) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:224) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:211) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:546) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:534) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:69) ~[na:na]
at io.cdap.cdap.etl.batch.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:111) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.MapReducePreparer.prepare(MapReducePreparer.java:97) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce.initialize(ETLMapReduce.java:192) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:109) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:32) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:182) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:177) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$1(AbstractContext.java:640) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:600) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:637) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:547) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:226) ~[na:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) ~[com.google.guava.guava-13.0.1.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]
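
From the trace, the failure happens while the source plugin loads the table schema over JDBC (loadSchemaFromDB). A standalone probe like the sketch below can help confirm, outside of Data Fusion, that the source connection actually returns column metadata; every connection detail here is a placeholder, and the SQL Server JDBC driver must be on the classpath:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class SchemaProbe {
    public static void main(String[] args) throws Exception {
        // Placeholders -- substitute your own host, database, table, and credentials.
        String url = "jdbc:sqlserver://<host>:1433;databaseName=<db>";
        try (Connection conn = DriverManager.getConnection(url, "<user>", "<password>");
             Statement stmt = conn.createStatement();
             // WHERE 1 = 0 returns no rows but still exposes result metadata,
             // which is roughly how a DB source derives its output schema.
             ResultSet rs = stmt.executeQuery("SELECT * FROM <table> WHERE 1 = 0")) {
            ResultSetMetaData meta = rs.getMetaData();
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                System.out.println(meta.getColumnName(i) + " : " + meta.getColumnTypeName(i));
            }
        }
    }
}

If this probe works but the pipeline still throws the NPE, the problem is more likely in the plugin configuration (e.g. the import query or driver name) than in connectivity.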

Any help would be greatly appreciated.

Thanks.

PS: After solving this issue, I'm now running into:

java.lang.ClassCastException: io.cdap.plugin.db.DBRecord cannot be cast to io.cdap.plugin.db.DBRecord
at io.cdap.plugin.db.batch.source.AbstractDBSource.transform(AbstractDBSource.java:267) ~[database-commons-1.2.0.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$transform$2(WrappedBatchSource.java:69) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:68) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:36) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeStage.consume(PipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeTransformExecutor.runOneIteration(PipeTransformExecutor.java:43) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:142) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:230) ~[cdap-etl-batch-6.0.1.jar:na]
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:135) [na:na]
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_212]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_212]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_212]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_212]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_212]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_212]
at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114) [io.cdap.cdap.cdap-app-fabric-6.0.1.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(Unknown Source) [hadoop-mapreduce-client-app-2.8.5.jar:na]

PS 2: With the above solved, I can now migrate tables. However, sometimes I get the stack trace below as a warning, which then forces the job to end. The job retries itself before actually failing (I don't know whether that's default behavior). It also looks like either it can't write that many rows to the destination database, or the connection is being lost. This is preventing me from migrating certain tables. Any idea why?

com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
at com.mysql.jdbc.Util.getInstance(Util.java:387)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:917)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:896)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:885)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1246)
at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1241)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4564)
at io.cdap.plugin.db.batch.sink.ETLDBOutputFormat$1.close(ETLDBOutputFormat.java:90)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2021)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114)
at org.apache.hadoop.mapred.YarnChild.main(Unknown Source)
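
"No operations allowed after connection closed" means the MySQL connection was already gone by the time the sink tried to roll back, which usually points to the server (or a proxy in between) dropping the connection during a long write. One thing worth trying (these are standard MySQL Connector/J options, though whether they help depends on why the connection drops) is to pass connection arguments on the sink's JDBC URL; host and database here are placeholders:

jdbc:mysql://<host>:3306/<database>?autoReconnect=true&socketTimeout=600000

Checking the MySQL server's wait_timeout and max_allowed_packet settings is also worthwhile when only large tables fail mid-write.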

Thanks!


Asked by: Pedromlm
Viewed: 16 times
Edwin Elia 2019-07-13 05:29

For the second error, you're hitting a class loading bug: https://issues.cask.co/browse/CDAP-15636

As a workaround for this issue, try using the generic database source and sink instead of the product-specific ones. The configuration should be roughly the same.
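
For anyone puzzled by a class apparently failing to cast to itself: two classloaders that each load a class with the same name produce distinct runtime types, so the cast fails exactly as in the trace above. A minimal self-contained Java demo of the effect (the class name is made up for illustration):

import java.net.URL;
import java.net.URLClassLoader;

public class DuplicateLoaderDemo {
    public static void main(String[] args) throws Exception {
        // Where this class was loaded from (a classes directory or a jar).
        URL[] cp = { DuplicateLoaderDemo.class.getProtectionDomain()
                .getCodeSource().getLocation() };
        // An isolated loader (parent = null) re-loads the same class bytes.
        ClassLoader isolated = new URLClassLoader(cp, null);
        Object o = isolated.loadClass("DuplicateLoaderDemo").newInstance();
        // Throws java.lang.ClassCastException: DuplicateLoaderDemo cannot be
        // cast to DuplicateLoaderDemo -- same name, different defining loaders.
        DuplicateLoaderDemo d = (DuplicateLoaderDemo) o;
        System.out.println(d); // never reached
    }
}

This is the same shape as "DBRecord cannot be cast to DBRecord": the plugin and the platform each loaded their own copy of the class.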