Problem description
I am using Apache Spark 1.5.1 and trying to connect to a local SQLite database named clinton.db. Creating a data frame from a table of the database works fine, but when I do some operations on the created object, I get the error below, which says "SQL error or missing database (Connection is closed)". Funny thing is that I still get the result of the operation nevertheless. Any idea what I can do to solve the problem, i.e., avoid the error?
Start command for spark-shell:
../spark/bin/spark-shell --master local[8] --jars ../libraries/sqlite-jdbc-3.8.11.1.jar --classpath ../libraries/sqlite-jdbc-3.8.11.1.jar
Reading from the database:
val emails = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:sqlite:../data/clinton.sqlite", "dbtable" -> "Emails")).load()
Simple count (fails):
emails.count
The error:
15/09/30 09:06:39 WARN JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
	at org.sqlite.core.DB.newSQLException(DB.java:890)
	at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
	at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
	at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
	at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
	at org.apache.spark.scheduler.Task.run(Task.scala:90)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

res1: Long = 7945
Recommended answer
I got the same error today, and the important line is just before the exception:
15/11/30 12:13:02 INFO jdbc.JDBCRDD: closed connection
15/11/30 12:13:02 WARN jdbc.JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
	at org.sqlite.core.DB.newSQLException(DB.java:890)
	at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
	at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
So Spark succeeds in closing the JDBC connection, and then fails to close the JDBC statement.
Looking at the source, close() is called twice:
Line 358 (org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD, Spark 1.5.1)
context.addTaskCompletionListener{ context => close() }
Line 469
override def hasNext: Boolean = {
  if (!finished) {
    if (!gotNext) {
      nextValue = getNext()
      if (finished) {
        close()
      }
      gotNext = true
    }
  }
  !finished
}
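The interaction of these two call sites can be sketched in isolation (hypothetical class and method names, not the actual Spark source): the iterator closes the statement when it runs out of rows, and the task-completion listener closes it again afterwards, at which point the underlying SQLite statement is already gone.

```scala
import java.sql.SQLException

// Stand-in for a JDBC statement whose close() is not idempotent.
class Stmt {
  private var open = true
  def close(): Unit = {
    if (!open)
      throw new SQLException("[SQLITE_ERROR] SQL error or missing database (Connection is closed)")
    open = false
  }
}

// Stand-in for the JDBC result iterator: close() has a `closed` guard,
// but the flag is never set, so the guard never fires.
class ResultIterator(stmt: Stmt) {
  private var closed = false

  def close(): Unit = {
    if (closed) return // guard exists, but `closed` stays false forever
    stmt.close()
  }

  // Stands in for hasNext(): once the last row is read, close() is called.
  def exhaust(): Unit = close()
}

val stmt = new Stmt
val it   = new ResultIterator(stmt)
it.exhaust() // first close(), succeeds
try it.close() // second close(), from the task-completion listener
catch {
  case e: SQLException =>
    println("WARN Exception closing statement: " + e.getMessage)
}
```

The second call reproduces the shape of the warning above: the work has already finished (the count is correct), and only the redundant cleanup throws.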
If you look at the close() method (line 443)
def close() {
  if (closed) return
you can see that it checks the variable closed, but that value is never set to true.
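The missing piece can be sketched as follows (a hypothetical class, not the actual Spark patch): setting the flag at the end of close() makes any second call a no-op, so the completion listener no longer touches the already-closed statement.

```scala
// Minimal sketch of an idempotent close(), assuming the fix is simply
// to record that cleanup has already run.
class SafeCloser {
  private var closed = false
  var underlyingCloses = 0 // counts how often the real close work runs

  def close(): Unit = {
    if (closed) return
    try {
      underlyingCloses += 1 // stands in for stmt.close(); conn.close()
    } finally {
      closed = true // the assignment the buggy version never makes
    }
  }
}

val c = new SafeCloser
c.close() // iterator exhausts and closes
c.close() // completion listener closes again: now a no-op
```

With the flag set, calling close() any number of times performs the underlying cleanup exactly once.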
If I see it correctly, this bug is still present in master. I have filed a bug report.
- Source: JDBCRDD.scala (line numbers differ slightly)
This concludes the article on "SQLITE_ERROR: Connection is closed when connecting from Spark to a SQLite database via JDBC"; we hope the recommended answer is helpful.