Connect to Azure Databases from R
R has great options for connecting to SQL databases through the DBI for R project. RStudio also has great documentation on Databases using R, in particular the section “Best Practices”. The major lines of this post is based on the RStudio documentation, but I’ve added some details that I spent time resolving.
When connecting to SQL databases in Azure I use the DBI package and the odbc package. As explained in the “Best Practices” above, it is not desirable to enter database information and credentials directly in a connection string. That is, don’t do like this:
con <- DBI::dbConnect(odbc::odbc(),
server = "<server name>.database.windows.net",
database = "<database name>",
username = "<user>",
password = "<password>",
port = 1433
)
Windows
On Windows I use the “ODBC Data Source Administrator”:
A DSN with name “MyConnection” is then used in R as
con <- DBI::dbConnect(odbc::odbc(), dsn = "MyConnection")
You can test the connection with DBI::dbIsValid(con)
.
Linux
On Linux (which is relevant when using Shiny Server) I rely on the Free TDS drivers. On Ubuntu these are installed by running the following commands:
apt-get install unixodbc unixodbc-dev --install-suggests
apt-get install tdsodbc
The ODBC drivers are specified in the file /etc/odbcinst.ini
:
[FreeTDS]
Description = FreeTDS
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
Trace = no
Driver
and Setup
files may be in different locations on other systems, but at least the name of the shared libraries should be the same.
A connection is specified in the file /etc/odbc.ini
:
[MyConnection]
Driver = FreeTDS
Server = <server name>.database.windows.net
Database = <database name>
Port = 1433
Encrypt = yes
TDS_Version = 7.0
Timeout = 60
I can only make a connection work when TDS_Version
is 7.0
– using 7.x
with x
different from 0 has not worked for me.
In Azure one can make Azure Active Directory the only allowed form of authentication. So if you are having problems accessing a database, check if access with username & password is allowed.
You can also enter Username
and Password
in odbc.ini
, but there is no point: They are not read by R. That is, you need a connection string like:
con <- DBI::dbConnect(odbc::odbc(), dsn = "MyConnection", username = "<username>", password = "<password>")
.Renviron
file:
db_username = "<username>"
db_password = "<password>"
Sys.getenv("db_username")
and Sys.getenv("db_password")
.
(Check the references for Renviron
in a previous post.)
Differences between Linux and Windows
One thing that has caused me problems is that a SQL script passed to a SQL database in Azure through R is not executed in the same manner on Windows and Linux (with the setup I have described).
As a simple example, I often use SQL Server Management Studio (SSMS) on Windows to write and test SQL scripts (when the dbplyr package doesn’t suffice).
In SSMS the tables in a database are listed as <prefix>.<tablename>
.
In R, the command DBI::dbListTables(con)
returns the tables without the <prefix>
.
When executing a SQL script through DBI::dbSendQuery
with <prefix>
it works on Windows, but not on Linux.