Note: This article is reproduced from serverfault.com.

python - What can be an alternate source of input for the args of the getResolvedOptions() method in an AWS Glue job?

Posted on 2020-06-01 11:06:28

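I have a Glue job and I want to pass it parameters that are read with getResolvedOptions. One way I know of is to create a JobRun from a Lambda function and pass them there. What other ways are there to pass param1 and param2 to the code below?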

import sys
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ['param1',
                                     'param2'])

Note: I don't want to hard-code the parameter values in the script.

Thanks in advance.

Questioner
Rajnish kumar
pkarfs 2020-06-01 21:17:14

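You can do this easily with a CloudFormation (cfn) YAML template, or you can add the variables directly to the job through the CLI, SDK, console, etc. If you want to go the cfn route, you can define the resource as follows: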

  JobNAME:
    Type: "AWS::Glue::Job"
    Properties:
      Name: String
      Description: String
      Role: String
      GlueVersion: 1.0
      Command: 
        Name: "glueetl"
        ScriptLocation: String
        PythonVersion: 3
      DefaultArguments: {
          "--job-language": "python",
          "--param1" : VALUE,
          "--param2" : VALUE,
          "--TempDir" : String,
          "--job-bookmark-option" : "job-bookmark-enable",
          "--enable-continuous-cloudwatch-log" : "false",
          "--enable-continuous-log-filter" : "false",
          "--enable-metrics" : "false"
      }
      ExecutionProperty:
        MaxConcurrentRuns: 1
      MaxCapacity: 5
      MaxRetries: 1
      Timeout: 60
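
For the SDK route, here is a minimal sketch using boto3. The helper names `to_glue_arguments` and `start_run` are made up for illustration, and the job name and values are placeholders. It relies on `start_job_run` accepting an `Arguments` map whose keys carry the same `--` prefix as `DefaultArguments`; values passed this way apply to that run only and take precedence over the job's defaults.

```python
def to_glue_arguments(params):
    """Prefix each key with '--' and stringify values, the form Glue job arguments use."""
    return {"--" + key: str(value) for key, value in params.items()}


def start_run(job_name, params):
    """Start one run of an existing Glue job with per-run arguments (hypothetical helper)."""
    # boto3 is imported lazily so to_glue_arguments stays usable without it.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=to_glue_arguments(params),
    )
    return response["JobRunId"]


# Build the argument map without calling AWS.
run_arguments = to_glue_arguments({"param1": "value1", "param2": 42})
print(run_arguments)
```

Calling `start_run("my-job", {...})` needs AWS credentials and an existing job, so it is left out of the runnable part. The same call works from any script or scheduler, not only from Lambda.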

Once defined, you can read the parameters with getResolvedOptions. Note that some names are reserved for Glue defaults, for example:

import sys
from awsglue.utils import getResolvedOptions

## @params: [JOB_NAME <--default assigned, param1 <---your value, param2 <---your value]
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'param1','param2'])
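
To try the script outside Glue, where `awsglue` is usually not installed, a rough stand-in can help. The sketch below is an approximation built on `argparse`, not the awsglue implementation. The function name `resolve_options_locally` and the sample argument list are assumptions for illustration. It shows the shape of the input: each `--name value` pair on the command line becomes a `name: value` entry in the returned dict.

```python
import argparse


def resolve_options_locally(argv, option_names):
    """Approximate getResolvedOptions for local testing (not the awsglue code)."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for name in option_names:
        # Every requested option is required, passed as "--name value".
        parser.add_argument("--" + name, required=True)
    # Skip argv[0] (the script path) and ignore arguments that were not requested.
    known, _unknown = parser.parse_known_args(argv[1:])
    return vars(known)


# Simulated command line, similar in shape to what a Glue job receives.
fake_argv = [
    "script.py",
    "--JOB_NAME", "demo-job",
    "--param1", "a",
    "--param2", "b",
    "--TempDir", "s3://example-bucket/tmp",
]
args = resolve_options_locally(fake_argv, ["JOB_NAME", "param1", "param2"])
print(args["param1"], args["param2"])
```

Values always arrive as strings, so numeric or boolean parameters need to be converted in the script.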