Mobile app version of vmapp.org
Login or Join
Gail5422790

: Serving Images out of SQL server vs. File system vs. S3 etc My application (classic asp yay!) has about 2.1 million images @ 25GB and that only represents 90 days of data and I'd like to

@Gail5422790

Posted in: #Images

My application (classic asp yay!) has about 2.1 million images @ 25GB and that only represents 90 days of data and I'd like to go 365 at a minimum. I need to get these under control and am considering all options. What are your thoughts on the pros and cons of the following practices:


SQL Server
Pros: Easy to back up
Cons: Performance ?
File System
Pros: Speed
Cons: Redundancy, Back up is slow (currently researching doing Synthetic full back ups instead which might make that better)
S3 and the like
Pros: Bandwidth is shifted from my datacenter to Amazon, virtually unlimited storage.
Cons: Cost, Cost Analysis is tricky (estimating 80% of my bandwidth is images for ROI purposes), Difficult/Costly to swtich service providers should that become necessary


Does anyone else deal with the multi-million image challenge and how did you address it?

10.05% popularity Vote Up Vote Down


Login to follow query

More posts by @Gail5422790

5 Comments

Sorted by latest first Latest Oldest Best

 

@Goswami781

Ignore people who say, "Don't store images/binary data in the database" as they are basing their answers on old information (assuming you will be storing the data in a VarBinary type column). The performance concerns using SQL Server to store images can now be mitigated by using the FILESTREAM data type in SQL Server 2008. In essence, the FILESTREAM data type allows you to combine the ease of storing data in the database with the performance you get from serving files from a NTFS file store.

To quote SQL Mag:


"SQL Server 2008’s new FILESTREAM
support combines the benefit of
accessing LOBs directly from the NTFS
file system with the referential
integrity and ease of access offered
by the SQL Server relational database
engine."


For more info read this blog by Ravi S.Maniam on MSDN.

10% popularity Vote Up Vote Down


 

@Nickens628

Databases are designed for transactional data / consistency and security.

Media files (images, audio, video) tend to be created and maybe deleted, but very rarely updated. So generally there's no need to keep them transactionally consistent with other data and a database won't give you any real benefit there. Text content maybe a different matter.

As long as you don't have any problem with the concept of someone pulling your file directly if they have the file's URL, then a file system is fine. If you were running something like a photo library, where you expect to charge before people download the file, then that is probably a different matter. That is, once a user has paid, they may get a URL specific to that user or valid for only a short time, and the application handles multiple or temporary URLs pointing to the same image. That could still be handled by the app and a file system, but you end up serving the media through the application rather than as a straight file download (which would mostly rule out any benefits of S3) and there's less difference between DB and file system.

10% popularity Vote Up Vote Down


 

@Murphy175

If you decide to go with storing them in the file system, you might want to read up on this ServerFault question for some do's and don't's: Storing a million images in the filesystem.

10% popularity Vote Up Vote Down


 

@Samaraweera270

We don't have millions of images, but do have hundreds of thousands, and we use the hybrid approach - mysql for metadata, images stored on local disk for backup, and pushed to Amazon s3 where they are served to users. We've had no trouble with Amazon and availability. Moving to cloudfront is in our plans, just need to find the time.

This discussion may be helpful to you in your decision: ask.metafilter.com/59635/Millions-of-images
I would go with metadata in SQL server and files on the filesystem (or s3 or cloudfront). But the best answer depends on some other usage patterns:


do the images change often
can you serve the images directly from the filesystem (that is, img src="..." ) or do you need them to be access controlled. If the latter, then a database solution is best
are you serving a small number of images most of the time (the most recent 10%) or is the distribution relatively widespread.


Backups for millions of images are going to be complicated no matter how you arrange them - it's just a lot of data. I'd want to find a good case study on backing up blobs in SQL server before I committed to that solution. (Here's an article that might be useful: www.databasejournal.com/features/mssql/article.php/3738276/Storing-Images-and-BLOB-files-in-SQL-Server-Part-4.htm)

10% popularity Vote Up Vote Down


 

@Welton855

While I don't deal with the multi-million image challenge, I would use Amazon CloudFront. It all the files are stored in an S3 bucket but are server through Amazon's content delivery system. I would not use S3 alone.

My second choice would be file system. Simple and easy, only problem is if all these files end up in one directory the whole thing will crash, hard.

SQL to me would not be an option for a system like this. Not only are you getting charged for bandwidth transfer you will also be charged for the processing of the query --this will very depending on hosting, but I assume that you are using a dedicated server or at the least a vps where you will be charged for cycles. Then it will slow down your entire site if it uses the same database as the image server. If not then you add all this complexity of having to manage two database connections.

10% popularity Vote Up Vote Down


Back to top | Use Dark Theme