Tom Coughlin, President, Coughlin Associates, Inc.
Alex Grossman, CEO, Symply, Inc.
Adrian “AJ” Herrera, Vice President of Marketing, Caringo, Inc.
Erik Weaver, Luminary of the Future of M & E Storage, HGST, a Western Digital Brand
James DeRuvo, Editor-in-Chief, DoddleNEWS
Male Voiceover: The Digital Production Buzz is brought to you by KeyFlow Pro, media asset management software, designed to meet the needs of work groups at an affordable price.
Larry Jordan: Tonight on the Buzz, we are talking about object storage. We’ve been using it for years, but now, it’s migrating to the desktop. Tonight, we’ll learn what it is, how it works and why media creators and distributors need to pay attention to it.
Larry Jordan: We start with Tom Coughlin, the President of Tom Coughlin Associates. He presents a background on what object storage is, why it was designed and who it’s for. Next, Alex Grossman, the President of Symply Inc, explains how object storage can enable media creators to do more with their storage and how object storage can be used on the desktop.
Larry Jordan: Next, Caringo was one of the first companies creating object storage software. Tonight Adrian Herrera, VP of Marketing for Caringo, tells us about their company, their products and how they are used in media distribution. Next, Erik Weaver is a luminary of the future, of media and entertainment for HGST, who talks with us about why we need to start to change our storage solutions. All this plus James DeRuvo with our weekly doddleNEWS update. The Buzz starts now.
Male Voiceover: Since the dawn of digital filmmaking. Authoritative: One show serves a worldwide network of media professionals. Current: Uniting industry experts. Production: Filmmakers. Post-production: And content creators around the planet. Distribution: From the media capital of the world in Los Angeles, California, the Digital Production Buzz goes live now.
Larry Jordan: Welcome to the Digital Production Buzz, the world’s longest running podcast for the creative content industry, covering media production, post-production and marketing around the world. Hi, my name is Larry Jordan. Tonight, we’re talking about object storage. Let me start by saying that some of these discussions are highly technical and not always easy to understand. Object storage is a new way of storing, tracking, accessing and sharing files that forms the foundation of all Cloud computing, and now it’s migrating to the desktop.
Larry Jordan: Object storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, folder and file, or block storage, which manages data as blocks within sectors and tracks on a hard disk. The reason we want to cover this subject this week is that object storage is moving from the massive servers in the Cloud, to private Clouds used by the enterprise and now appearing in the smaller workgroups of media production and distribution.
Larry Jordan: Because media creates files that are both massive and numerous, it’s time for all of us to better understand this technology. Even if you only have 50 to 100 terabytes of media in your storage pool, object storage may be worth considering. And helping you better understand what’s going on is what tonight’s show is all about.
Larry Jordan: Before we start, however, I want to invite you to subscribe to our free weekly show newsletter at digitalproductionbuzz.com. Every issue, every week, provides quick links to the different segments on the show, plus articles of interest to filmmakers. And, best of all, it’s free and comes out every Saturday. Now it’s time for our weekly doddleNEWS update with James DeRuvo. Hello James.
James DeRuvo: Hello Larry.
Larry Jordan: So, what’s in the news this week?
James DeRuvo: Well, this week, RED is returning to the final frontier with a new low-light sensor. These guys just can’t help but build stuff. It’s a new low-light Super 35mm Sensor that fits somewhere in between the Helium Super 35 Sensor that they have and that Full Frame Vista Vision MONSTRO Sensor and it has an additional two stops of dynamic range over the Helium. It was designed to practically image in complete darkness. It was made for a very “special customer” for work in deep space.
Larry Jordan: Well, who do you think that is?
James DeRuvo: It could be NASA, but I think the conventional wisdom is that RED has built this new sensor for Elon Musk’s SpaceX Corporation and their upcoming launch of their new Falcon Heavy Mars Rocket. It’s a great story. They’re going to launch it in a couple of days and they need some ballast to get the payload right and so, Elon Musk is putting his Tesla convertible into the rocket and it’s going to shoot it up into Outer Space and I think he wants to be able to capture it on camera. They made this special low-light sensor for work in Outer Space.
James DeRuvo: The downside is, they only were allowed to make five additional sensors that they can sell, so, anyone that does any extreme low-light video can get one of these, but it’s not going to be cheap.
Larry Jordan: RED is pushing the final frontier. What else is happening this week?
James DeRuvo: Blackmagic, just today, updated their URSA Mini and is turning it into a broadcast beast. They want to make an affordable 4K ultra high definition broadcast quality camera that has 12 stops of dynamic range, supports both analogue and newer digital B4 lenses, plus the option of adding the EF mount or the PL mount. They’ve also announced a new 4K broadcast switcher and updated ATEM 4 M/E switcher and you can turn your old 2 M/E switcher into a 4 M/E switcher with a free software update.
Larry Jordan: What’s your take on the announcement?
James DeRuvo: The updated URSA Mini can now be used as a broadcast camera, capturing in 4K and full HD, but it can also work as a field camera, supporting B4 as well as cinematic lenses in EF and PL mounts, so it’s really two cameras in one. That’ll give a shooter an additional revenue stream and it’s under 4,000 bucks. It’s a bargain for … looking to boost their income.
Larry Jordan: Okay, that’s Blackmagic. What’s our third story this week?
James DeRuvo: Facebook continues to go to the dark side. They’re experimenting with vertical video. The social giant has discovered that more users are watching video on their mobile devices; something like 97% of Facebook users will watch video on Facebook using a smartphone. They’re more likely to watch a vertical video, because it’s just easier to hold their phone that way. They’ve commissioned 15 content creators from around the world to make vertical videos, in order to start pushing this new kind of genre and format.
Larry Jordan: I love your image of Facebook going to the dark side. What’s your real opinion of this format?
James DeRuvo: You know, I understand the allure of vertical video, you know, because more and more people are ingesting their content with a mobile device. But, in my opinion, that tall and skinny aspect ratio completely ruins a video presentation by making it with these thick black bars on either side. I’m just not a fan of it and I don’t think that it’s going to have a future beyond mobile. But if you want to get a kick, go to YouTube and do a search for vertical video syndrome. It’s an hysterical video that kind of pokes fun at the whole craze.
Larry Jordan: James, what other stories are you following this week?
James DeRuvo: Other stories we’re following this week include, Canon could lock down future cameras and lenses with a fingerprint ID system. Zoom has a great new mini field recorder and there’s a ton of new firmware updates.
Larry Jordan: Where can we find all these stories?
James DeRuvo: All these stories and more can be found at doddlenews.com.
Larry Jordan: James DeRuvo is the Editor-in-Chief of doddleNEWS and joins us every week. James, thank you so much, we’ll talk to you next Thursday.
James DeRuvo: Alright, see you then.
Larry Jordan: When you can’t find your media, you need a media asset management solution. KeyFlow Pro. This simple but powerful software is designed specifically to help you organize, track and find your media. Whether you work alone, or as part of a group, its intuitive user interface helps you easily store, sort, search, play, annotate and share your media, using team based shared libraries over a network. Its wide range of features are all at a very affordable price and, with the new 1.8.3 update, rescanning is up to ten times faster. Plus, KeyFlow Pro is integrated with macOS notifications, enabling you to collaborate faster and smarter, all in real time.
Larry Jordan: KeyFlow Pro is available at the Mac App Store, or get a 30 day free trial at keyflowpro.com. KeyFlow Pro, simple, elegant and surprisingly affordable.
Larry Jordan: Tom Coughlin is a Silicon Valley Consultant, a Storage Analyst, a senior member of the IEEE and the organizer of the annual Storage Visions and Creative Storage Conferences. Hello Tom, welcome back.
Tom Coughlin: Hi there, it’s good to be back.
Larry Jordan: Tonight we’re taking a closer look at object storage. How would you define what this is?
Tom Coughlin: I think in order to understand object storage, you need to have a little bit of background on some of the other more common storage architectures. We’ll start with what’s called block storage and the data that’s actually stored on a storage device like a hard disk drive, or a solid state drive, for example, is stored in blocks and blocks are a certain range of data that’s in one physical location on the storage device.
Tom Coughlin: Block storage is what most storage devices are basically built around, is storage in blocks. A bunch of blocks can be put together to create a file. Now a file could be a video, it could be some audio, it could be stuff like that. What you do is, you break up the data that goes into a file into little pieces called blocks and those are individually stored on the storage device.
Larry Jordan: The way that most hard disks work is they divide the storage space on the hard disk into little blocks, little cubbyholes and they drop pieces of the file in those blocks.
Tom Coughlin: That’s correct.
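The block idea described above can be sketched in a few lines of Python. This is only an illustration, not how any particular drive or file system works: the function names and the tiny 4-byte block size are assumptions chosen to keep the output readable (real devices typically use blocks of 512 bytes or 4 KB).

```python
def split_into_blocks(data: bytes, block_size: int = 4) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def reassemble(blocks: list[bytes]) -> bytes:
    """Join the blocks back together, in order, to recover the original file."""
    return b"".join(blocks)


# Hypothetical "file" contents, used only for illustration.
file_data = b"video frame data"
blocks = split_into_blocks(file_data)
print(blocks)
print(reassemble(blocks) == file_data)
```

The file only makes sense again once something, here `reassemble`, knows which blocks belong together and in what order. That bookkeeping is the job of the file system or the application.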
Larry Jordan: Within these blocks, the files are stored in blocks, and then they’re organized by the file system. Tell me what a hierarchical file system is.
Tom Coughlin: A hierarchical file system is basically a structure of different folders that contain files. Those folders may be within folders. Having something within something else is what they mean by a hierarchy.
Larry Jordan: The benefit to a hierarchical file system is it enables us to find the stuff that we stored on the hard disk?
Tom Coughlin: It’s one way of finding things that we store on the hard disk, that we store in a storage system, in this case, if we’re talking about a network attached storage device.
Larry Jordan: Okay. I’ve got that. That’s what we’re using now with our RAIDS and our hard disks?
Tom Coughlin: Most of them. There also are systems that are built around organizing things at the block level and those are called storage area networks. There’s file based storage, there’s block based storage and, if done properly, the block based storage can actually be faster because, if it has enough control of where things are stored on the individual storage devices that the system does, then it can organize them in ways that allows you, in the right circumstances, to get a very fast performance.
Tom Coughlin: Some of the highest performance storage may actually be in a storage area network, which is block based. But that has limited metadata and all of the information on putting the files together is in the application.
Larry Jordan: Okay, that’s block storage and file storage, so where does object storage fit in?
Tom Coughlin: What object storage is, instead of having this hierarchical structure of files within folders within folders, it increases and changes the kind of metadata that we use to find the content itself, to find the individual files. The way it does that is, it does not have a hierarchical structure. A basic object storage system has a very flat structure, there’s no folders within folders. As a consequence, it is capable of scaling to a much larger number of things that are accessible within the same storage architecture, which can span geographic areas, can go across multiple storage units in multiple places.
Tom Coughlin: The big advantage of the object storage is, it allows you to be able to store a lot more things. It does that by having an object. Now an object contains the data and it also contains advanced metadata concepts, which can be much richer than is allowed by say network attached storage devices with their hierarchical structure. Part of what gives object storage its value is the things you can do with that metadata, which can even include special purpose metadata that a designer can put into that content to assist in finding it and using it.
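As a rough illustration of what Tom describes, the Python sketch below models an object as data plus metadata, kept in a single flat namespace and found by metadata rather than by folder path. It is a toy in-memory model under stated assumptions, not any vendor's product: the class names, object IDs and metadata fields (`project`, `codec`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object bundles the data itself with descriptive metadata."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory store: one flat namespace, no folders within folders."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, object_id: str, data: bytes, **metadata: str) -> None:
        self._objects[object_id] = StoredObject(data, dict(metadata))

    def get(self, object_id: str) -> StoredObject:
        # Direct lookup by ID; there is no directory tree to walk.
        return self._objects[object_id]

    def find(self, **criteria: str) -> list[str]:
        """Return the IDs of objects whose metadata matches every criterion."""
        return sorted(
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = FlatObjectStore()
store.put("obj-001", b"...", project="pilot", codec="prores")
store.put("obj-002", b"...", project="pilot", codec="h264")
store.put("obj-003", b"...", project="trailer", codec="prores")
print(store.find(project="pilot"))
print(store.find(project="pilot", codec="prores"))
```

Nothing in the lookup depends on where an object lives, which is the property that lets real systems spread objects across many storage units and locations.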
Tom Coughlin: Object storage was really taken and embraced by the hyper scale folks that build all the Cloud infrastructure, all the Cloud storage systems at big data centers. The biggest application they had for the object storage was, they had huge amounts of content, say Facebook with all of their photographs and they needed to store this cost-effectively. What they did is they created an object storage system with software, so it’s a software defined storage, on top of commodity hardware, i.e. the cheapest storage hardware that they could buy, that would still serve the purpose they had.
Tom Coughlin: A lot of the Cloud, basically, is object storage, the big amounts of content that are stored in the Cloud are stored as objects in object storage systems, because it’s the way they do it, it’s cost-effective. Object storage then has been good for archival applications, or where you want to keep a lot of content. It has a history of not having as high a performance as, for instance, a network attached storage system which is file based, or a storage area network which is block based. Those can have higher performance than the object storage.
Tom Coughlin: Part of that is because of that focus on as low cost a storage as you can. Object storage allows you to then scale to much larger amounts of things that can be accessed within a given storage name space, no matter where it’s stored at. It could be stored in a geographically different area and it could still be part of this object storage system, so it can scale to billions of items. It has this specialized metadata which also can be tied into applications, so there’s things that can be done within the storage system because of the specialized metadata capabilities.
Tom Coughlin: Now, if you start building object storage with equipment that is specialized on the use of object storage that is focused on higher performance, it’s possible that object storage can also make in-roads into some of these higher performance applications.
Larry Jordan: That is a hard subject to figure out, the object storage. I didn’t realize it was as technically complex as it is.
Tom Coughlin: Yes, it involves the basic architecture of how you’re storing information and what it is, it’s creating this really flat architecture. Because the hierarchy can get in your way, it can slow you down and it makes it hard to store a lot of individual things, because it gets so complicated. But this flat architecture allows you to directly access individual pieces without going through a hierarchy, so it scales more.
Larry Jordan: Can we be object storage without being Cloud based at all?
Tom Coughlin: You can be object storage without having any connection to the Internet; it can just be a local storage.
Larry Jordan: Within media and entertainment, who should pay attention to object storage? Only people that are storing stuff to the Cloud, or does it have broader implications?
Tom Coughlin: I think it has broader implications. First of all, there is all the public Cloud infrastructure which is being widely used for, for instance, content delivery. Content libraries are often basically a Cloud storage system, CDNs and that sort of thing. Also for archiving. But especially in the media and entertainment industry, there are folks that are not so comfortable with putting their valuable content into a public Cloud, so, private Cloud sort of infrastructure is another place where this can occur, or even what’s called a hybrid Cloud which contains maybe some things in a public Cloud and some things in a private Cloud, could as well benefit from these object storage type systems.
Larry Jordan: Can object storage co-exist with our existing storage, or do we have to do all one, or all the other?
Tom Coughlin: It can co-exist. In fact, an object storage system can even support a file based access. Oftentimes it’s a software defined storage architecture and it can be implemented on top of existing hardware.
Larry Jordan: Again, within media and entertainment, who should pay attention to object storage the most? In other words, who is it really targeted for?
Tom Coughlin: Well object storage was developed and is most of the storage that’s in the Cloud. I mean, there’s a growing amount of people using public Cloud services and also people implementing their own private Cloud, or some type of hybrid Cloud, which can either be their private and a public Cloud, or a private and multiple public Clouds. It’s basically becoming, I think, a universally available tool.
Tom Coughlin: In fact, as a consumer, you’re using the Cloud all the time because that’s where the applications that you run, for instance on your mobile phone, are located. It’s kind of become something that touches all of us in some way and will do so increasingly in the future, especially as people become more accountable and as it’s clear that there is security and encryption and protection of privacy leakage and that sort of thing, which is especially important for those valuable media assets in our industry.
Larry Jordan: One of the things I’ve learned, Tom, is that storage changes constantly and it is not necessarily easy to understand. Tell me about your conferences, because those could be really helpful to folks.
Tom Coughlin: I think they really could be. We do the Creative Storage Conference, which is going to be June 7th in Culver City, California, which focuses on applications and architectures of digital storage in media and entertainment. We’re looking especially to recruit speakers who are media and entertainment professionals, who have some experience with digital storage, who would like to talk about that. You know, things they like, things they’ve tried, problems they’ve had. The website that gives more information on that or to participate is creativestorage.org.
Larry Jordan: Tom, for people that want to keep track of what you and your company are doing, where can they go on the web?
Tom Coughlin: They can go to tomcoughlin.com.
Larry Jordan: That’s all one word, tomcoughlin.com and the upcoming June conference is Creative Storage at creativestorage.org and the Fall conference is Storage Visions and that’s at storagevisions.com. Tom, thanks for joining us today.
Tom Coughlin: Thank you very much for having me Larry.
Larry Jordan: Alex Grossman is the CEO of Symply. He’s a 25 year veteran of the storage industry and a former Senior Director at Apple; where he was responsible for driving the development of Apple’s server and storage products. Hello Alex, welcome.
Alex Grossman: Hi Larry, how are you today?
Larry Jordan: I am really looking forward to our conversation, because, if anybody can really explain what object storage is, it’s you. But before we do, give me a quick description of what Symply is.
Alex Grossman: Oh yes. Well we’re a storage vendor. That’s kind of crazy huh Larry? We actually do media and entertainment storage, but we do it in a slightly different way. We actually build smaller workgroup storage based on common platforms, virtualized technology, software defined. We use the backbone of StorNext, which is the Quantum product; probably the largest file system out there today in media and entertainment and it’s 100% compatible with Xsan; so we do scale out workgroup clustered storage.
Larry Jordan: Well you’ve been involved with object storage for a long time and what we’ve learned from Tom is that it’s principally used today for Cloud services such as Amazon, or Dropbox. Why should we consider moving to object storage for the desktop or media production environments?
Alex Grossman: Oh, that’s a great question and, yes, I’ve had a lot of experience with it and it’s changed over the years. Really, what Tom probably talked about was how object storage is ideal for larger and long-term storage, because, you know, the Cloud vendors will handle trillions, in fact, I think, as you recently said, they had over 100 trillion objects out there. When you’re talking about large scale, there is definitely a lower cost for doing object storage and it’s also very highly protected. It’s kind of self-protecting and it’s very safe and also, from the Cloud perspective, it’s really easy to get to.
Alex Grossman: One of the challenges and part of the reason we should consider it on the desktop is, one of the challenges we have as content creators is that we have content management challenges. We all need to protect the content we have and protect it well, because, our hope is that it’s going to be worth something over time. You know, if we have the next Star Wars, or the next Gone with the Wind, think about how long we’ll have that content and how much it’s going to be worth. The idea of taking large content and only putting it up to the Cloud, it’s a great idea but it can be cost prohibitive if we ever need to bring it back and it can also take a long time to get it there.
Alex Grossman: There is technology like Snowball from Amazon, that allows us to actually take a RAID array and connect it to our SAN or our NAS infrastructure within our facility and then copy it to that RAID array, put it on the FedEx truck and move it up to the Cloud. Amazon will then load it up, it will go into Glacier and it will be there for a long time, which is a great thing. The problem is, you’re doing that frequently and you’re having it still in one spot controlled by someone else.
Alex Grossman: The question really is, what do you do if you want to have long-term retention of content and, in fact, before I even go there, one of the things that we see quite often is that it’s really not even about long-term retention, it’s about the ultimate goal of having a workflow where it’s seamless all the way from ingest to archive and then also it’s round-tripping, so you can come back into your work in process. That means you really need it near line, you need to access it quickly and you need it in your facility. If that’s the case, what technologies do you use to be able to do that? What’s the ultimate goal?
Alex Grossman: For the longest time, if we wanted to talk about long-term retention, or even near-line retention we used tape. I mean, as much as I might have been quoted in 1998 saying tape is dead, but, as much as we try to get rid of it, tape keeps coming back and it keeps getting cheaper. The beauty of tape is, once you write the tape and you shut off the tape system, it’s cold, so, you know, you’re not wasting power, you’re not wasting any energy at all, or requiring to cool it. The problem with tape is that, like anything else, it has to be rewritten periodically, it has to be kept very well and the access time can be kind of slow. In fact, it can be really slow if you can’t find the tape.
Alex Grossman: There’s tools that are supposed to be used to fix that, you know, media asset managers and other tools, but, the bottom line is that, what we’re looking for is we’re looking for low cost, we’re looking for fast return on investment, we’re looking for guaranteed access and this is where object storage really wins.
Larry Jordan: What Tom told us is that object storage has been traditionally used on the Cloud, especially for archiving and distribution, but performance is not its strength. How do we get around these performance issues, because performance is everything in media, especially in post and production?
Alex Grossman: Absolutely and that’s a great question Larry. The key there is knowing where to use it and I almost like to think I was one of the early pioneers in doing this. We always look at content as being online, near-line and far-line. Online is the speed that we need, for the most part, that is our workspace for media.
Alex Grossman: Most people still use SANs when they’re talking about 4K and, you know, 4K DPX and uncompressed and very high speed media, they still need SANs and Fibre Channel infrastructure to do that. But more and more people are looking at NAS infrastructures where you can use ten gig and 40 gig and 100 gig Ethernet and achieve fairly good performance numbers. That’s when you’re talking about ingest and work in process.
Alex Grossman: Once you get past that finishing step, now we’re into three other stages, we’re into transcoding, which you also could say is delivery, delivery and then archive. What object storage really can help us with and this is where Tom gets it as well, is that, we need to go from that work in process, that finishing step, to the delivery step, which isn’t latency dependent. Object storage can be relatively fast, the problem is, it’s the latency that kills this in production.
Alex Grossman: I don’t think we’re going to get to the point, any time soon, using very safe things like erasure coding and object storage, to allow us to do production, but when it comes to near-line and delivery and transcoding, what I coined a long time ago, which is extended online, is these steps that we still consider in our production, but that we don’t need those latency dependent applications for. But once the content is finished, we want to retain it and keep it safe and that’s what object storage really can help us with.
Larry Jordan: Let’s say that we have a workgroup that is in transcode and delivery and distribution, what’s involved in setting up the system? Does it require professional help and is it affordable by non-enterprise sized groups?
Alex Grossman: Wow, that is really the cusp of the whole thing. If you were to Google object storage and/or low-cost object storage, you would see numbers starting at $50 million, starting at $5 million, starting at $100,000. The affordability exactly is true, it’s all about the enterprise access to that. When people read about object storage, they read probably the most important thing, which is, it’s easy access to the content and what I mean by that is that, it’s globally distributed, as Tom probably said. The beauty of this is, you can access it essentially through a web browser, so HTTP, this REST interface, these REST APIs and we’ve made the transition to go from a file based storage system to an object based storage system virtually seamless.
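To give a feel for what access over HTTP and a REST interface looks like, here is a minimal Python sketch using only the standard library. It builds the requests but deliberately does not send them, because the endpoint `objectstore.example.com` and the object key are hypothetical placeholders. Real services also require authentication and have their own URL layouts, none of which is shown here.

```python
import urllib.request

# Hypothetical endpoint and object key, for illustration only.
BASE_URL = "http://objectstore.example.com"
OBJECT_KEY = "finished-cut.mov"


def build_put(key: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT that would store an object."""
    return urllib.request.Request(
        f"{BASE_URL}/{key}",
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_get(key: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP GET that would retrieve the object."""
    return urllib.request.Request(f"{BASE_URL}/{key}", method="GET")


put_request = build_put(OBJECT_KEY, b"media bytes")
get_request = build_get(OBJECT_KEY)
print(put_request.get_method(), put_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

Against a real endpoint, each request would be passed to `urllib.request.urlopen`. The point of the sketch is that storing and fetching an object comes down to ordinary web verbs on a URL.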
Alex Grossman: That helps us to take object storage and to be able to put it on the desktop in ways that are really Cloud based, so if you use Google Drive, or iCloud, or AWS, or some people use things like Storage Made Easy, which is a gateway to go to all those clouds, or Dropbox, all these things are object storage based and they put it on the desktop.
Alex Grossman: The question is, what if I wanted to build object storage in my facility and I’ve got a small facility, I don’t have an enterprise class facility? You have to ask yourself three questions first. Firstly, do you need a petascale architecture? When you say petascale it means, really simply, how much content do I create every day, week, month, year, how many years of content do I want to keep and is that content in terabytes or is it in petabytes. If the content is somewhere about 500 terabytes or approaching a petabyte then I say, today you can build yourself object storage rather than buying object storage and you can actually do that and be cost effective and have it work for you really easily. That’s really the key, Larry, is really getting that cost down.
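The sizing question Alex raises (how much content is created, and for how many years it is kept) is simple arithmetic. The Python sketch below works it through for a hypothetical facility. The 2 TB per week ingest rate and the 5 year retention period are assumptions for illustration, not figures from the interview; only the 500 terabyte threshold comes from the discussion above.

```python
def projected_capacity_tb(tb_per_week: float, years_retained: float) -> float:
    """Total content kept, in terabytes, assuming a steady weekly ingest rate."""
    weeks_per_year = 52
    return tb_per_week * weeks_per_year * years_retained


# Hypothetical facility: 2 TB of new content per week, kept for 5 years.
total_tb = projected_capacity_tb(tb_per_week=2, years_retained=5)
print(total_tb)         # 520 terabytes
print(total_tb >= 500)  # True: in the range where building your own is suggested
```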
Alex Grossman: The third thing is, you have to look at, well what am I going to do about managing it? The beauty of object storage and why it’s so popular in the Cloud is, not only can it scale to trillions of objects, but also it abstracts the normal storage management away from you. You don’t have to worry about individual RAID drives and storage pools and RAID sets, all that stuff goes away and all you’re essentially doing is building one namespace that just grows as you need it. Just keep plugging drives in.
Alex Grossman: The big benefit or reason why you can do it yourself is that, you eliminate data migration. One of the problems that we have in a production facility, no matter if it’s small or large, is we’re always buying hard drives. If I’m just one or two people working in a facility, I’m buying a lot of external hard drives and I’ve got cabinets full of them. The problem is, the power supplies are going to go, the drives are going to fail and I’m constantly worrying about that and migrating data off, or making two or three copies and it’s just not worth it.
Alex Grossman: If I’ve got a slightly bigger facility, I might have a RAID system connected to a SAN or NAS and, for the most part, we’re seeing people going back to SAN infrastructures, where they’re doing 4K and DPX and other uncompressed workflows, but at that point, you’ve got to say to yourself, well those drives and that whole infrastructure is good for three to five years.
Alex Grossman: Take something like Star Wars, that I was saying before, that came out in 1977. If I do the math, let’s say that’s 41 years ago. If I’m migrating every five years, I’ve basically moved that content out eight times. When you’ve got a terabyte to move it’s not so bad, but when you’ve got 500 terabytes or a petabyte, those eight times of moves could take three months, or six months to move, because you’ve got other things going on. You can’t migrate that data real easily.
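The migration count in the Star Wars example can be checked with a short Python sketch. The function name is hypothetical; the 41 years and the five year refresh interval are the figures Alex uses.

```python
def migrations_needed(years_held: int, refresh_interval_years: int) -> int:
    """Number of complete hardware refresh cycles over the holding period."""
    return years_held // refresh_interval_years


years_held = 41  # time since the 1977 release, as stated in the example
print(migrations_needed(years_held, 5))  # 8
```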
Alex Grossman: The beauty of object storage is that, I can put it out there and because the interface is always the same, my access is the same way and, theoretically, 40 years from now I will still have an HTTP REST interface to be able to access it. I grow and migrate in place. It’s very, very cheap. But, how do you build it, that’s the question? Well, there’s a number of open source projects that people use and there’s a number of companies that have taken these open source projects, embedded those into standard off the shelf servers. They get the same resiliency out of just a few off the shelf servers and a few very inexpensive SAS arrays that you would get if you spent $1 million on a RAID system. It’s using technologies like … or Swift and you can usually find companies that are building these things, or that will help you build them. For the most part, if you’re looking at half a petabyte to a petabyte, that’s the most cost-effective way.
Larry Jordan: Where can people go to learn more about the products and the services that your company provides?
Alex Grossman: At symply.com.
Larry Jordan: Is that gosymply or just symply?
Alex Grossman: You can use gosymply.com.
Larry Jordan: That’s gosymply.com and Alex Grossman is the CEO of Symply. Alex, thanks for joining us today.
Larry Jordan: Adrian Herrera is the VP of Marketing for Caringo. Caringo provides an object based storage platform for production, distribution and collaboration in media and entertainment industries, to solve issues associated with storing and protecting rapidly growing digital assets, while keeping them online and accessible. Hello AJ, welcome.
Adrian Herrera: Hi Larry, how are you?
Larry Jordan: Good. You know, after reading that introduction, could you describe what it is that Caringo does in English?
Adrian Herrera: Sure. We are software vendors and developers and we’ve been doing it for 12 years. We’ve always focused on object storage. You install our software on any x86 servers and create a massively scalable pool of storage. At our core, that’s what we do. I know you’re doing the series on object storage and it’s great that you’re doing it, but it’s almost a misnomer to just classify object storage as storage. We do focus a lot on content delivery, data management and search, so it’s a very rich software platform, it’s not just about storage.
Larry Jordan: Well, is object storage a thing, or is it a process? How would you describe it?
Adrian Herrera: Object storage is more of an architecture, it’s a way to approach how you store data. The analogy that a lot of us vendors like to use in the space is, it’s very similar to a valet. You know, you take your car to a valet you give them your car, they give you a ticket, you don’t really care where they park your car, you just want it to be safe and not scratched and get it back when you give them your ticket back. This is, at its simplest form, object storage. You give this storage your content and the storage gives you a key and off you go.
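The valet analogy can be sketched as a toy in-memory store in Python: you hand over content, you get a ticket back, and you never see where the content is parked. This is only an illustration, not how Caringo or any other product is built; the `ObjectStore` class, its method names and the choice of a random UUID as the key are all assumptions made for the example.

```python
import uuid
from typing import Optional


class ObjectStore:
    """Toy object store: give it content, get back a key (the valet ticket)."""

    def __init__(self) -> None:
        # key -> (content, metadata); the caller never sees this layout.
        self._objects = {}

    def put(self, content: bytes, metadata: Optional[dict] = None) -> str:
        """Store the content and return a newly generated key."""
        key = uuid.uuid4().hex  # assumption: a random ticket, unrelated to any file name
        self._objects[key] = (content, dict(metadata or {}))
        return key

    def get(self, key: str) -> bytes:
        """Hand the ticket back, get the content back."""
        return self._objects[key][0]

    def metadata(self, key: str) -> dict:
        """Return a copy of the descriptive metadata stored with the object."""
        return dict(self._objects[key][1])


store = ObjectStore()
ticket_a = store.put(b"first cut", {"name": "clip.mov"})
ticket_b = store.put(b"second cut", {"name": "clip.mov"})

# Two objects with the same name do not overwrite each other,
# because each one is addressed by its own key rather than by a path.
print(ticket_a != ticket_b)
print(store.get(ticket_a))
```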
Larry Jordan: Help me understand because I get confused here. I take a file, I drop it on top of my RAID and my RAID stores it. I don’t care what sector it’s in, or what RAID disk it’s stored in, I just simply gave it to the RAID and it parked it. Why is this different from object storage?
Adrian Herrera: That’s because you know where the file is. You know the directory, you know the file name and what’s actually going on behind the scenes is, that file that you store is being shredded, broken into thousands of different segments, inodes and it’s being stored across the RAID sectors. That’s great when you’re in a single location, or if you know the application. But, let’s say you don’t know the application, let’s say you don’t know the end user, or if you have to deliver that piece of content over the Internet and a lot of organizations are struggling.
Adrian Herrera: When you’re dealing with file systems, you have to put a lot of layers of infrastructure to deliver. You need to know the exact location, which may be great, when you’re dealing with thousands, maybe hundreds of thousands of files, but, how about a million? How about a billion? These are scales that organizations of all size are getting to. Especially in the M&E world, we’re seeing some very unique issues that object storage solves.
Larry Jordan: Give me an example of what object storage can do that other storage systems can’t.
Adrian Herrera: You can deliver the content, the media directly from the storage layer. The native interface for most object storage solutions is TCP/IP, so you deliver it straight over the web. You can compare that to file systems, like NFS, or SMB and you have to put some layer of translation in front of it. With object storage, you can also protect content continuously.
Adrian Herrera: You mentioned RAID, which is a great example. You know, RAID was developed to protect a disk and you protect a disk by adding certain amounts of additional disk and, if something goes wrong, you have to recover not only that specific piece of content, but the entire RAID sector, or the entire disk and, you know, recovery takes a long time.
Adrian Herrera: When you think about storing data not as a sector across a file system and RAID set, but storing it as a complete object on a disk, or SSD, you can then do some very creative things from an automation perspective, very useful things. You can automatically protect it, you can move it across the entire system, so that you’re continuously optimizing the system, you can store metadata with it and we’re seeing that more often these days. It’s very easy to store, protect and access very, very large scale datasets and we’re talking hundreds of terabytes to hundreds of petabytes and millions to billions of files. It’s an efficient way to store data, or to identify that information.
Larry Jordan: That gets me directly to Caringo. How does Caringo implement object storage? What am I getting from you and what does it do?
Adrian Herrera: As I mentioned we’re software vendors, so you get our software and you can either go out, or we’ll help you size the appropriate hardware. We also have a number of partners in our own hardware plants that we can bring. You either install our software on an x86 server hardware and you can mix and match. That’s where a lot of the value comes in, you’re buying commodity servers and you are making them enterprise grade by installing our software.
Adrian Herrera: We also offer a number of interfaces and a lot of object storage vendors offer a lot of interfaces. Historically, object storage has been around for a very long time, it’s been around as long as we’ve been around, we were one of the first in the space, if not the first in the space, so it’s been around over a decade. The main interface, 12 years ago, was an API and, 12 years ago, APIs weren’t all that popular. You know, applications wanted standards, they wanted SMB, they wanted CIFS, they wanted NFS and that was one of the reasons why, you know, object storage took so long to really become mainstream and it’s mainstream now.
Adrian Herrera: A lot of it has to do with the work that Amazon’s put in. Amazon’s done a tremendous job of pushing web services, AWS and particularly S3 and they’ve basically got the entire industry comfortable with developing to APIs and RESTful interfaces and that’s one of the reasons why, you know, object storage is now taking off. That in conjunction with M&E organizations needing this technology.
Adrian Herrera: We hear, over and over again, the struggles of trying to provide nearly free storage and instant access. You know, the production houses tell us that their customers, the studios, you know, when they work on projects for them, they want the production houses to store those projects indefinitely. When they come knocking on the door again, for a new project, that uses some of the same content, they need them to provide those projects and that content instantly. They don’t want to wait a day to go and grab it off the tape archives. This is a very challenging issue for production houses.
Larry Jordan: Looking at the smaller end, for people that want to start with this, what would you describe as an organization size that should consider object storage and what’s an entry price?
Adrian Herrera: Any size organization. We offer a free ten terabyte version of our software, fully functional, so, if organizations want to try it out, they can go ahead and come to our site and download it. The reason why is, certain types of object storage aren’t just good at storing a lot of data, from a capacity perspective. Some of them, like Caringo, are good at storing a lot of different files, small files, and that’s where file systems have trouble. Object storage systems, by storing the content in a more complete way, are more efficient at storing very small files, for very long periods of time.
Adrian Herrera: You know, traditionally you protect content via RAID, there’s a parity disk that you set in place. What object storage does is, it’s very similar to RAID, but it’s at the content level. At its core, that’s what object storage does, it focuses the storage on content. Traditional storage is really focused on the file system and the application’s needs. Object storage is about the content, it’s about protecting the content, providing access to the content, managing it, providing metadata around that content, so it’s easy to search against, it just simplifies the process of preservation and content delivery.
Larry Jordan: If I’m a single user, just one or two people, that need to keep the files internally, I’m thinking video editing is the closest example, object storage is not going to help me a whole lot. But if I have a wide-ranging organization, where I need to send files around to different offices, or different people, object storage would be a better use case there?
Adrian Herrera: Well it depends. If you’re a small production shop and you have hundreds of terabytes of data and, you know, you’re struggling with protecting that. You know, production studios now, they have to provide that project immediately. There is a notion of scale here. You know, we find that, if you have ten terabytes, you may be able to handle this with a NAS that you buy at price, but, once you start getting to the hundreds of terabytes level, that’s when your storage infrastructure takes a lot more care and feeding. You know, you’re not going to plug in ten NASs and, you know, keep them going all the time. You may, but that’s not going to scale effectively, so that’s when object storage really comes into play.
Adrian Herrera: But we have organizations that are really small, that are using us, we have a lot of organizations using us for, you know, tens of terabytes.
Larry Jordan: For people that want more information about Caringo, where can they go on the web?
Adrian Herrera: They can go to caringo.com. For M&E customers, we do suggest going to our M&E section, we have a lot of great information on use cases and we produce a number of education webinars, so they can go to the resource sections and check those out.
Larry Jordan: Adrian Herrera is the VP of Marketing for Caringo and, AJ, thanks for joining us today, this has been fascinating.
Adrian Herrera: Thank you Larry.
Larry Jordan: Here’s another website I want to introduce you to, doddlenews.com. doddleNEWS gives you a portal into the broadcast, video and film industries. It’s a leading online resource, presenting news, reviews and products for the film and video industry. doddleNEWS also offers a resource guide and crew management platform specifically designed for production. These digital call sheets, along with their app, directory and premium listings, provide in-depth organizational tools for busy production professionals.
Larry Jordan: doddleNEWS is a part of the Thalo Arts community, a worldwide community of artists, filmmakers and storytellers. From photography to filmmaking, performing arts to fine arts and everything in between, Thalo is filled with the resources you need to succeed. Whether you want the latest industry news, need to network with other creative professionals, or require state of the art online tools to manage your next project, there’s only one place to go, doddlenews.com.
Larry Jordan: Erik Weaver is a specialist, focused on the intersection of the Cloud and the media and entertainment industry. He’s currently creating strategy for HGST, this is a Western Digital Brand, and prior to HGST he worked with the USC Film School, to develop next generation Cloud standards to support global studios, as well as executive producing a number of award-winning films. Hello Erik, welcome.
Erik Weaver: Thank you. Glad to be here.
Larry Jordan: Well we are glad you are here; because, in listening to the other interviews today, our brains have exploded and your job is to help us put it all back together into a nice neat package. Because, tonight we’re looking at object storage. How would you define this from your perspective?
Erik Weaver: I would define object storage as the solution to the next generation of problems. I would say that, without getting too deep into things like erasure coding and the other aspects of object storage, it’s to fundamentally overcome the limitations of RAID. I’m not sure if you’re aware, but RAID basically hits a wall at about six terabytes. That means that, at six terabytes, your time to rebuild the system gets to a point in which you might have failure while rebuilding, so you may go into a permanent failure of rebuilding a drive. Anybody who’s dealt with this understands this kind of pain. Object storage is a way to scale up and out that steps over those limitations.
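A rough way to see why rebuild time grows with drive size is to assume a rebuild has to rewrite the whole replacement drive at some sustained rate. The sketch below does that division in Python. The 100 MB/s rate and the 1 and 12 terabyte comparison sizes are illustrative assumptions, not figures from the conversation, and the model ignores everything else that affects a real rebuild.

```python
def rebuild_hours(drive_terabytes: float, rebuild_mb_per_sec: float) -> float:
    """Hours needed to rewrite a whole drive at a constant sustained rate."""
    drive_bytes = drive_terabytes * 1e12      # decimal terabytes
    bytes_per_sec = rebuild_mb_per_sec * 1e6  # decimal megabytes per second
    return drive_bytes / bytes_per_sec / 3600


# Assumed sustained rebuild rate of 100 MB/s.
for size_tb in (1, 6, 12):
    print(f"{size_tb} TB drive: about {rebuild_hours(size_tb, 100):.1f} hours")
```

Under that assumption, the time scales linearly with capacity, so every hour the array spends rebuilding is another hour in which a second failure can occur.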
Larry Jordan: Is object storage hardware, am I buying an object storage box? Or is it software? Or is it a file system? What is it?
Erik Weaver: Typically, object storage is software, it’s laying over fundamentally the same drives as other things, but you can buy it either way.
Larry Jordan: Does object storage enable me to find stuff better, or does it enable me to be more efficient with my storage capacity, or what?
Erik Weaver: It doesn’t actually provide a taxonomy or ontology to your metadata, or to your data itself, to help you find it better, but it allows you to grow that data bigger. For example, right now, the new drives are going to be 40 terabytes by 2025, based on the microwave assisted technologies. Understanding how to place all that data on these disks, it allows you to find it, but it doesn’t allow you to identify it or create those tags to it. But, being able to structure that across multiple drives is absolutely critical.
Larry Jordan: This sounds like a really wonderful problem for somebody that’s well versed in IT, but why should us media people care? Because, I mean, we can barely spell hard disk.
Erik Weaver: You’ve got to care because of kind of the two things that I talked about just a second ago. The drive sizes are getting bigger and that’s what’s going to be out there. The second thing is basically the content’s getting bigger. There’s kind of a culmination of a problem. If you start off with the new cameras, for example, the Arri ALEXA is a new one that just came out that shoots at 2.6 terabytes an hour. That’s a lot of content, that stacks up very, very quickly.
Erik Weaver: At the same time, the television sets on the opposite side are now coming out in UHD 10 or the new standards around ultra-high definition. Between those two things, the middle is going to get squished to be producing that content. You’re either going to have a huge amount of data coming straight from set, or you’re going to have somebody upstream pulling you, saying I need this content in this higher resolution. The middle does really get squeezed and, whether we like it or not, we’re going to have to deal with it.
Larry Jordan: What do we need to do to implement object storage? Do I need to throw out everything I’ve got and buy all new hardware, or can I retro-fit it and how much pain am I about to go through?
Erik Weaver: That is a fairly complex question. I would reach out to an expert on that. Most object storages can be based on some standardized protocol. For us, we use S3 or the same protocol that Amazon speaks. A lot of people are already moving towards Clouds themselves, so this is basically your own on set private Cloud and, basically, that’s exactly why Google and Amazon go to it, because you can scale up and scale out for that data.
Erik Weaver: But you would really need to sit down and look at the tools that you’re using and understand if they’re S3 compatible, or what you need to do in that workflow. We also work with several different companies who create kind of that bridge, or that data mover between those different devices.
Larry Jordan: Well now I’m confused. Does that mean that all of my media is not going to be stored locally, it’s going to be stored up to the Cloud? Because, if that’s the case, my bandwidth connection to the Cloud isn’t fast enough to be able to upload this.
Erik Weaver: I wouldn’t say that. A lot of the object storage systems out nowadays are private object storage, so, it depends on how large you are. You could either have your own private object storage, which would vary in size. A lot of them start somewhat large, you know, upwards of 500 terabytes. That’s pretty big. But, once you get beyond that, it’s going to be a mixed evolution. Sometimes you’ll have things on your site or, if you’re a slightly larger production company, you might have a mixture or a co-location facility that’s called an IOA, or an exchange based format, so you would basically place those racks in a co-location facility and then cross connect directly to an Amazon or Google. It just depends on your size, or the size of the company.
Larry Jordan: I guess a bigger question is, who should even consider this? I mean, clearly enterprises and companies that are in the hundreds of thousands of terabytes, but should a small documentary workgroup consider it? If so, is there a border upon which we should say, ah I should move to object storage, or I should be okay with traditional storage?
Erik Weaver: I think it’s anybody who gets to the point of 200 terabytes and beyond. That, or if you have a long-term strategy to go towards the Cloud. Either one of those is about the point where you begin to really need to start thinking about this.
Larry Jordan: What does HGST have that helps us to get into this transition?
Erik Weaver: HGST actually has several different systems. We’re typically a little bit more towards the enterprise class, we have systems that start around 500 terabytes and go up to 52 petabytes. It’s not your everyday system, there’s a couple of other tools out there that we recommend for bridging the gap. The one I like the best is a company called Avalanche.io, they’re a wonderful tool for filmmakers. Basically, they give you a virtual desktop that shows you everything, making it look like it’s all right in front of you and it bridges between things like RP100, a G-rack, because we also own Gtech, and Amazon.
Erik Weaver: It looks at all of your storage and it’s based on something called C4, or SMPTE standard 2114 and what that is, is that’s a hashing algorithm that’s standardized. Internally, a lot of different solutions will look at things like MD5, or other different types of checksum, this is basically a standardized checksum.
Erik Weaver: This checksum is now beginning to be implemented all over the place, things like PIC systems to onset photo cam cards, or onset technicolor cards, or possibly even Colorfront. A lot of the main tools are now beginning to implement this across the board, so that you know, absolutely, that you have the same file and how many copies of that file and how do you relate metadata to that file. It’s a bit of a topic, but it’s a very important evolution in standardization for the community.
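The idea of using a checksum to know that two copies are the same file can be sketched with Python's standard library. This example uses a plain SHA-512 hex digest as a stand-in; it does not produce the identifier format defined by the SMPTE standard Erik mentions, and the function name and sample data are made up for illustration.

```python
import hashlib


def content_checksum(data: bytes) -> str:
    """Return a hex digest that depends only on the content, not on its name or location."""
    return hashlib.sha512(data).hexdigest()


camera_card = b"frame data from the camera card"
copy_on_raid = bytes(camera_card)           # same content, stored somewhere else
other_take = b"frame data from another take"

# Identical content gives an identical checksum, wherever the copy lives.
print(content_checksum(camera_card) == content_checksum(copy_on_raid))
# Different content gives a different checksum, even if the file names match.
print(content_checksum(camera_card) == content_checksum(other_take))
```

Because the checksum is derived from the content alone, any tool that computes it the same way can agree on which files are copies of each other and attach metadata to that shared identity.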
Larry Jordan: Well it sounds like, if I can summarize what you’ve been saying, if we’re one person working on a locally attached device, we don’t really need to worry about object storage immediately, but as our number of people in our workgroup expands, as our storage capacity expands, and as we need to send files outside a local workgroup, object storage is something we need to consider. Have I summarized that correctly?
Erik Weaver: Absolutely. That’s a great summary. This is not something for everybody, it’s a little bit more for large enterprise systems. You do need to understand the fundamental characteristics that touch Cloud and why that’s important. It’s all S3 typically driven, because that’s the protocol between the two and most object storage systems, and that there are tools out there, so you do need to understand what the tools are to move between different layers of storage.
Erik Weaver: In the future, we’ve kind of gone from your standardized breakdowns of say an NAS/SAN object, then it goes to software defined, right? But software defined basically has a heavy onus of integration that’s very challenging to keep up with. But the third phase, kind of coming around the corner, is really that phase in which there’s a light coupling and easy way to connect with different software applications and protocols, which is via things like C4. It helps create a structure to understand exactly where all your data is, at any particular time, and then create a Rosetta stone for the different resources.
Larry Jordan: For people who want more information about what HGST offers for object storage, where can they go on the web?
Erik Weaver: HGST.com/media.
Larry Jordan: That’s HGST.com/media and Erik Weaver is a specialist focusing on the intersection of the Cloud and the media entertainment industry. A luminary for the media and entertainment industry at HGST. Erik, thanks for joining us today.
Erik Weaver: Thank you so much for having me.
Larry Jordan: I’ve been thinking a lot about storage recently. Here at my office, I have about 200 terabytes of storage capacity, tracking just a bit more than 550,000 files and, even though I, or someone in my team created most of them, I’m still having a hard time finding the right file at the right time. Every week, between my training and the Buzz, I create another 50 to 100 files that I need to store and track. Adding to the challenge, I’m still accessing files that I created ten, even 15 years ago and I know, from all my emails, that I’m not unique.
Larry Jordan: That was one of the driving reasons I had in wanting to do a show on object storage, I wanted to learn whether I could use it. Object storage was first described in 1995, with active development starting around 2000. It is at the heart of virtually every file stored on the web, from music on Spotify, to files on Dropbox and everything on Amazon. It seeks to solve the inherent limitations of traditional hierarchical file systems, such as Windows or HFS Plus, which limit the number of files that we can store in a folder, or on a hard disk, or in a RAID.
Larry Jordan: Additionally, while RAIDs protect us from a hard disk crash, object storage can protect us from a file crash, where a media file becomes corrupt and can’t be played. Objects contain additional descriptive properties, which can be used for better indexing, or file management, as well, we don’t need to spend time setting up RAIDs or managing hard disks. Object storage also allows the addressing and identification of individual objects by more than just file name and file path, we no longer have to worry about two files erasing each other, because they have the same name.
Larry Jordan: The problem is, that currently, object storage systems are expensive and difficult to implement. That’s what companies like Symply and HGST are trying to change. By now, we know that media production, editing and distribution are only going to make the file management problem worse. Object storage is not for everyone, at least not yet, but our current system is rapidly running out of room to expand. As Erik Weaver said, at some point, we will need to change how we store our files.
Larry Jordan: Learning more about object storage gives us a better understanding of our current problems and helps us plan the transition to whatever we’ll be using in the future. Just something I’m thinking about.
Larry Jordan: I want to thank our guests for this week, Tom Coughlin of Coughlin and Associates, Alex Grossman of Symply Inc, AJ Herrera of Caringo, Erik Weaver of HGST, and James DeRuvo with doddleNEWS. There’s a lot of history in our industry and it’s all posted to our website at digitalproductionbuzz.com. Here you’ll find thousands of interviews all online and all available to you today. And remember to sign up for our free weekly show newsletter that comes out every Saturday.
Larry Jordan: Talk with us on Twitter and Facebook at digitalproductionbuzz.com. Our theme music is composed by Nathan Dugi-Turner; with additional music provided by smartsound.com. Text Transcripts are provided by Take 1 Transcription, visit take1.tv to learn how they can help you. Our Producer is Debbie Price, my name is Larry Jordan and thanks for listening to the Digital Production Buzz.
Larry Jordan: The Digital Production Buzz is copyright 2018 by Thalo LLC.
Announcer: The Digital Production Buzz was brought to you by KeyFlow Pro. A simple, but powerful media asset manager for collaboration over a network. Download a free 30 day trial at keyflowpro.com.