This article is reproduced from serverfault.com.

apache spark - Connect to MSSQL from PySpark

Posted on 2020-11-29 10:18:59

I am trying to connect to an MS SQL database from PySpark using spark.read.jdbc:

import os
from pyspark.sql import *
from pyspark.sql.functions import *
from pyspark import SparkContext;
from pyspark.sql.session import SparkSession
sc = SparkContext('xx')
spark = SparkSession(sc)

spark.read.jdbc('DESKTOP-XXXX\SQLEXPRESS',
                """(select COL1, COL2 from tbl1 WHERE COL1 = 2) """,
                properties={'user': sa, 'password': 12345, 'driver': xxxx})

I don't know which arguments I should pass for sc = SparkContext('xx') and for 'driver': xxxx.

Questioner: Beso
mck 2020-11-29 18:27:49

Replace serveraddress with the address of your database:

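from pyspark import SparkContext
from pyspark.sql.session import SparkSession

sc = SparkContext()
spark = SparkSession(sc)
# SQL Server requires an alias on a subquery used as a table; load() performs the read.
df = spark.read \
     .format('jdbc') \
     .option('url', 'jdbc:sqlserver://serveraddress:1433') \
     .option('user', 'sa') \
     .option('password', '12345') \
     .option('driver', 'com.microsoft.sqlserver.jdbc.SQLServerDriver') \
     .option('dbtable', '(select COL1, COL2 from tbl1 WHERE COL1 = 2) AS t') \
     .load()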
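
As a rough sketch of how the two open parameters from the question might be filled in: the first positional argument of SparkContext is the master URL (for example 'local[*]' on a single machine), and the 'driver' value is the JDBC driver class, which for Microsoft's SQL Server driver is com.microsoft.sqlserver.jdbc.SQLServerDriver. The helper names build_jdbc_options and read_query below are hypothetical, the databaseName part is an optional assumption, and the driver JAR is assumed to be available to Spark already (for example via --jars or spark.jars).

```python
def build_jdbc_options(host, user, password, query, port=1433, database=None,
                       driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"):
    """Collect the options for a Spark JDBC read against SQL Server (hypothetical helper)."""
    url = f"jdbc:sqlserver://{host}:{port}"
    if database:
        # Optional: select a specific database through the connection URL.
        url += f";databaseName={database}"
    return {
        "url": url,
        "user": user,
        "password": str(password),  # JDBC options are passed as strings
        "driver": driver,
        # SQL Server needs an alias on a subquery used in place of a table name.
        "dbtable": f"({query}) AS t",
    }


def read_query(spark, options):
    """Run the read. Needs a live SparkSession and the driver JAR on the classpath."""
    return spark.read.format("jdbc").options(**options).load()


if __name__ == "__main__":
    opts = build_jdbc_options("serveraddress", "sa", "12345",
                              "select COL1, COL2 from tbl1 WHERE COL1 = 2")
    print(opts["url"])
```

With a session created as in the answer, something like df = read_query(spark, opts) would then return the DataFrame; this part is untested here because it needs a reachable SQL Server instance.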