Note: this post is translated from stackoverflow.com. Original question: azure - DATABRICKS DBFS

azure databricks file system

azure - Databricks DBFS

Published 2020-04-11 12:46:11

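I need to get some understanding of Databricks DBFS.

In simple, basic terms: what is it, what is its purpose, and what does it allow me to do?

The Databricks documentation says this much:

"Files in DBFS persist to Azure Blob storage, so you won't lose data even after you terminate a cluster."

Any insight would be helpful. I have not been able to find documentation that covers it in depth from an architecture and usage perspective.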

Asked by: Billy B


Eva 2019-02-25 20:57

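I have experience with DBFS. It is a great storage layer that holds data you can upload from your local computer using the DBFS CLI. The CLI setup is a bit tricky, but once you have it working you can easily move whole folders around in this environment (remember to pass the overwrite flag, --overwrite in the Databricks CLI, when replacing existing files). With it you can:

- create folders
- upload files
- modify and delete files and folders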
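The same folder and file operations can also be done from inside a notebook rather than from the CLI. The following is a minimal sketch, assuming it runs in a Databricks Scala notebook where `dbutils` is predefined; the folder name and file contents are hypothetical placeholders, not part of the original answer.

```scala
// Assumes a Databricks Scala notebook, where `dbutils` is already available.
// All paths and contents below are hypothetical placeholders.
val dir = "dbfs:/foldername"

// Create a folder (missing parent directories are created as well).
dbutils.fs.mkdirs(dir)

// Write a small text file; the third argument overwrites an existing file.
dbutils.fs.put(s"$dir/test.csv", "id,name\n1,alpha\n2,beta\n", true)

// List the folder contents with their sizes.
dbutils.fs.ls(dir).foreach(f => println(s"${f.path} (${f.size} bytes)"))

// Copy a file, then delete the copy.
dbutils.fs.cp(s"$dir/test.csv", s"$dir/test_copy.csv")
dbutils.fs.rm(s"$dir/test_copy.csv")

// Optional cleanup: delete the folder and everything in it (recurse = true).
// dbutils.fs.rm(dir, true)
```

This works on files already reachable from the workspace. Uploading from a local machine still goes through the CLI, as described above.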

Using Scala, you can easily pull in the data stored there with code like this:

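// `spark` is the SparkSession that a Databricks notebook provides.
// "some_column_name" is a placeholder for a real column in the file.
val df1 = spark
      .read
      .format("csv")
      .option("header", "true")       // first line holds the column names
      .option("inferSchema", "true")  // let Spark guess the column types
      .load("dbfs:/foldername/test.csv")
      .select("some_column_name")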

Or read in a whole folder to process all the available CSV files:

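// The glob pattern loads every CSV file in the folder into one DataFrame.
// "some_column_name" is a placeholder for a real column in the files.
val df1 = spark
      .read
      .format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("dbfs:/foldername/*.csv")
      .select("some_column_name")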
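When many files are loaded at once, schema inference costs an extra pass over the data, and it can help to know which file each row came from. The variant below is a sketch under assumptions: the two-column schema (`id`, `name`) and the `source_file` column name are hypothetical, and `input_file_name` is the Spark 3.x SQL function, which may not be the preferred approach in every Databricks runtime.

```scala
import org.apache.spark.sql.functions.input_file_name
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical schema: replace with the real columns of your CSV files.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = true),
  StructField("name", StringType, nullable = true)
))

// Read every CSV in the folder with an explicit schema instead of inferSchema,
// and tag each row with the path of the file it was read from.
val dfAll = spark.read
  .format("csv")
  .option("header", "true")
  .schema(schema)
  .load("dbfs:/foldername/*.csv")
  .withColumn("source_file", input_file_name())

// Row count per input file, a quick check that all files were picked up.
dfAll.groupBy("source_file").count().show(truncate = false)
```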

I think it is easy to use and learn. I hope this information helps!

Billy B 2019-02-26 21:12:45

Thanks for the answer, Eva. It is very helpful, and I appreciate the time and effort you put into explaining it in detail.
