Saturday, November 9, 2013

Seven Easy Steps to Move Photo Archive to SQL Server using FileStream (My pocket Big Data project).

I have million of photos in my archive.
I would like to gather internal photo information from all of them into a database and build some BI reporting.

That would be absolutely impossible to do in manual mode.

I tried to do it earlier using BULK inserts and C# programming, but every time I had to load new pictures I had to do some changes in the process. It wasn't easy task for me to automate it.

So, I decided to try SQL Server FileStream functionality for that loading.

In this article I will describe all steps I went through to get my pictures inside of SQL Server.
(All steps are extremely important to follow)

Step 1 - Configuring FileStream on your PC.

A. Start "SQL Server Configuration Manager". Choose your instance of SQL Server and click properties.

B.Select FILESTREAM tab

C. Enable All options and come up with Shared name of the folder where you will keep your files.

D. Click "OK".

E. On keyboard press Win-R and run "Compmgmt.msc" application and verify creation of shared resource

Step 2 - Map drive to FileStream Folder.


On keyboard press Win-R and run "cmd" command. In the window type following command and hit "Enter":
NET USE <DRIVE LETTER>:  \\<YOUR SERVER NAME>\<SHARE NAME>

You can choose any free drive letter for your mapped drive. In  my case it was "Z".
Here is my example:
net use Z: \\SLAVA-PRO\SQLServerFileStream

Do not close that window we will re-use it.

Step 3 - Configuring your SQL Server for FileStream.

A. Connect to your SQL Server and run following command:


USE Master;

GO

EXEC sp_configure;

GO






If your Advanced properties not enabled you have to see something like this:
As you can see, your "filestream access level" option is set to "1"

B. Run following script to Enable Filestream Access:



USE Master;

GO

EXEC sp_configure filestream_access_level, 2;

RECONFIGURE

GO



C. Run same script as in (A) to verify that change is successful:
As you can see values have changed.

Step 4 - Create and prepare a database.

A. Run following script to create a new database (change your DB name if you want):




USE Master;

GO

IF EXISTS (
  SELECT TOP 1 1 FROM sys.databases 
  WHERE name = 'MyPhotoDB' 
)

  BEGIN

    ALTER DATABASE MyPhotoDB  
    SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;

    DROP DATABASE MyPhotoDB;

  END

GO

CREATE DATABASE MyPhotoDB;



B. Change your DB to Simple Recovery mode (if it isn't and if you wish):



USE Master;

GO

ALTER DATABASE MyPhotoDB SET RECOVERY SIMPLE

GO



C. Now you have to add new file group to your DB and add a new file to that group.
(You must specify EXACT file path for your new folder)



USE Master;

GO

ALTER DATABASE MyPhotoDB  
ADD FILEGROUP FileStreamGroup CONTAINS FILESTREAM;

GO

ALTER DATABASE MyPhotoDB

ADD FILE (

NAME = N'MyPhotoFileStream',

FILENAME = N'D:\MyPhotoFileStream'

) TO FILEGROUP FileStreamGroup

GO


D. To verify that new folder has been created run following command in your CMD window (which you opened in step #2) (Use your folder name):
dir d:\MyPhotoFileStream*
(you need asterisk to see parent folder)
Here is my result for that request:
Folder Exists. It means everything has been created correctly.


Step 5 - Set Database FileStream Access.

A. Run following script to capture pre-existing conditions:



SELECT DB_NAME ( database_id ) AS [DB_Name],

directory_name,

non_transacted_access,

non_transacted_access_desc

FROM sys.database_filestream_options
WHERE database_id = DB_ID('MyPhotoDB');

You have to get something like this:

B. Now run following script to set FileStream access for our DataBase:


ALTER DATABASE MyPhotoDB

SET FILESTREAM (

NON_TRANSACTED_ACCESS = FULL,

DIRECTORY_NAME = N'MyPhotoStream'

);

C. To confirm DataBase is  set re-run script from (A) and you have to get the following result:

Step 6 - Create a File table for your files.

A. Create a table by following script
(Do not forget to change your current Database from Master to the new one).


USE MyPhotoDB

GO

CREATE TABLE tbl_My_Photos AS FileTable

WITH (

FileTable_Directory = 'MyPhotoStream',

FileTable_Collate_Filename = database_default

);

B. Try to access new table by following script:

SELECT * FROM tbl_My_Photos;
You have to get something like this:
 

Step 7 - Copy your files.


A. In Explorer (Win-E) - Go to your set of files or set of folders with files. Select them and Copy them in the buffer:

B. Browse to your Z: drive (or the drive name you set in Step #2) and go inside of "Z:\MyPhotoStream\MyPhotoStream" folder and paste your files/folders there

C. After copying is done, return to SSMS and check your new table again by following script:

SELECT t.name,

t.cached_file_size as "File Size",

t.creation_time as "Added to DB",

file_stream.GetFileNamespacePath() as Full_Path

FROM tbl_My_Photos as t;

I have the following:

Wow! All my from four folders now in SQL Database and I can query them!


Extra Effort.

To show what I can do, I've created a function that extracts some basic info from my pictures:
SELECT t.name,

t.cached_file_size as "File Size",

t.creation_time as "Added to DB",

d.[Camera Make],

d.[Camera Model],

d.[Lens],

d.[Camera Date],

d.[Digitized Date],

d.[Modified Date],

d.[Software]

FROM tbl_My_Photos as t

CROSS APPLY dbo.fn_Decode_Canon(file_stream) as d;

Here is a result:
There is a lot of room for improvements of my function, such as decoding Exposure time, aperture, Focal Length, ISO, White Balance etc., that would be topic for another blog post.

 

Checklist:

1. Configure FileStream on your PC in "SQL Server Configuration Manager";
2. Create mapped drive to access SQL Server FileStreeam shared folder;
3. Configure SQL Server for FileStream;
4. Create and prepare a Database to accept external files;
5. Configure DataBase's FileStream Access;
6. Create File Table to hold data files.
7. Copy your data files!

Full Script for Steps 3-6:

-- STEP 1. Configure FileStream on your PC in "SQL Server Configuration Manager";

-- STEP 2. Create mapped drive to access SQL Server FileStreeam shared folder;
-- NET USE Z: \\SLAVA-PRO\SQLServerFileStream

-- STEP 3.  Configure SQL Server for FileStream;
USE Master;
GO
EXEC sp_configure filestream_access_level, 2;
RECONFIGURE
GO
/*
USE Master;
GO
EXEC sp_configure;
GO
*/

-- STEP 4. Create and prepare a Database to accept external files;
USE Master;
GO
IF EXISTS (SELECT TOP 1 1 FROM sys.databases WHERE name = 'MyPhotoDB')
BEGIN
  ALTER DATABASE MyPhotoDB SET RESTRICTED_USER WITH ROLLBACK IMMEDIATE;
  DROP DATABASE MyPhotoDB;
END
GO
CREATE DATABASE MyPhotoDB;
GO
ALTER DATABASE MyPhotoDB ADD FILEGROUP FileStreamGroup CONTAINS FILESTREAM ;
GO
ALTER DATABASE MyPhotoDB ADD FILE ( NAME = N'MyPhotoFileStream', FILENAME = N'D:\MyPhotoFileStream') TO FILEGROUP FileStreamGroup
GO
/*
-- Or You can use following script to create DB with FILESTREAM
CREATE DATABASE MyPhotoDB  ON
PRIMARY ( NAME = MyPhotoDB, FILENAME = 'D:\SQL2012\Data\MyPhotoDB.mdf'),
FILEGROUP FileStreamGroup CONTAINS FILESTREAM( NAME = MyPhotoFileStream,
    FILENAME = 'D:\MyPhotoFileStream')
LOG ON  ( NAME = MyPhotoDBLog, FILENAME = 'D:\SQL2012\Data\MyPhotoDB.ldf')
GO
*/

-- Step 5. Configure DataBase's FileStream Access;
USE Master;
GO
ALTER DATABASE MyPhotoDB SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'MyPhotoStream' )
GO
/*
SELECT DB_NAME ( database_id ) AS [DB_Name], directory_name, non_transacted_access, non_transacted_access_desc
FROM sys.database_filestream_options
WHERE database_id = DB_ID('MyPhotoDB');
*/

-- Step 6. Create File Table to hold data files.
USE MyPhotoDB
GO
CREATE TABLE tbl_Photo_Gallery AS FileTable
 WITH ( FileTable_Directory = 'Photo_Gallery', FileTable_Collate_Filename = database_default);
GO

-- Step 7. Copy data files!
/*
SELECT t.name,
t.cached_file_size as "File Size",
t.creation_time as "Added to DB",
file_stream.GetFileNamespacePath() as Full_Path
FROM tbl_My_Photos as t;
*/
 



No comments:

Post a Comment