Philip Hodgetts, President, Lumberjack System
Terence Curren, Founder/President, Alpha Dogs Inc.
Josh Wiggins, Chief Commercial Officer, GrayMeta
Jim Tierney, President, Digital Anarchy
Laurent Martin, Cofounder and CMO, Aitokaiku
James DeRuvo, Film and Technology Reporter, DoddleNEWS
Larry Jordan: Tonight on the Buzz we look at artificial intelligence, and machine learning. What exactly are they? Should we be worried about our jobs, or should we embrace the new technology?
Larry Jordan: We start with Philip Hodgetts, the CEO of Intelligent Assistance. Philip’s first foray into AI was almost ten years ago, with First Cut which provided a way to automatically create a selects reel based solely on metadata. Tonight, Philip brings us up to speed on the latest AI technology.
Larry Jordan: Terence Curren, founder and president of Alpha Dogs joins us to explain the impact that AI will have on post production jobs, and what we need to do to prepare for the changes.
Larry Jordan: Josh Wiggins is the chief commercial officer for GrayMeta, a company that uses AI to screen and flag thousands of hours of media for objectionable content which can simplify repurposing films for different distribution requirements.
Larry Jordan: Jim Tierney, the president of Digital Anarchy has developed a plug-in for Adobe Premiere that automatically transcribes audio files into text, using artificial intelligence and cloud based servers. Tonight, Jim shares his views of the future of AI.
Larry Jordan: Laurent Martin founded Aitokaiku to create personalized music, live on your mobile device that reflects what you are doing right now. And he used AI to do it. Tonight, he explains what he did, how it works and what he sees as the future role of AI.
Larry Jordan: All this, plus James DeRuvo with our weekly DoddleNEWS update. The Buzz starts now.
Announcer: Since the dawn of digital filmmaking – authoritative – one show serves a worldwide network of media professionals – current – uniting industry experts – production – filmmakers – post production – and content creators around the planet – distribution. From the media capital of the world in Los Angeles, California, the Digital Production Buzz goes live now.
Larry Jordan: Welcome to the Digital Production Buzz, the world’s longest running podcast for the creative content industry, covering media production, post production and marketing around the world.
Larry Jordan: Hi, my name is Larry Jordan. Tonight’s show grew out of a blog I wrote a few weeks ago entitled I’m Worried about the Future of Editing. In it I described my concerns that as technology speeds toward adopting more tools, powered by artificial intelligence, many editors would get pushed out of work. That blog generated a lot of readers and a lot of response, so we wanted to explore this idea more fully tonight on the Buzz.
Larry Jordan: After I published my blog, I sent a copy to Philip Hodgetts, our first guest tonight. Philip has been working with AI for years as well as creating a number of critically useful utilities for video editors. While there seems to be no doubt that AI will impact video editing, there is a lot of debate as to the how and the when. Philip wrote a different take on the importance of AI as a response to my blog, so we invited him on tonight to help us explore this issue.
Larry Jordan: Other guests are all directly working with AI in some form or another. Terry Curren runs a post production house and has strong opinions on the impact this technology will have on jobs in the industry. Josh Wiggins, Jim Tierney and Laurent Martin are all creating tools that use AI to enable us to accomplish tasks that are not easy to do any other way.
Larry Jordan: I am fascinated by all these discussions and hope they enable us to learn more about this industry changing technology as well.
Larry Jordan: Now it’s time for a DoddleNEWS update with James DeRuvo. Hello James.
James DeRuvo: Hello James, what are you doing James? Oh sorry, I thought we were talking about AI.
Larry Jordan: We are talking about AI but that is not the same thing as 2001. What have we got on the news this week, what’s happening?
James DeRuvo: Intel announced this week that they’re rounding out their new i9 family of chips. Code named Basin Falls, the new i9 chip has up to 18 cores and a teraflop of processing power. They’re adding additional chips to the family in response to AMD’s lower priced Threadripper. Intel will add a Skylake X-series of i7 processors, as well as a lower priced Kaby Lake i5, and the architecture will have up to eight cores. But the i5s will not have hyper threading. Prices will start at around $1,199 for the Basin Falls i9, and they will be available in September and October.
Larry Jordan: How does this Intel offering compare with AMD’s latest chips?
James DeRuvo: Considering that AMD’s Threadripper line costs about $1,000 less, and just has two fewer cores, it makes them quite attractive to those on a serious budget. Especially considering that the more cores you have, the slower your computer runs. So if you want your computer to run faster, but you want to have more power, that AMD Threadripper line becomes very attractive. So Intel was under pressure to respond and it looks like the chip war is going to continue.
Larry Jordan: Alright, that’s our first story. What’s number two?
James DeRuvo: Did you know that the Panasonic GH5 can actually shoot 6K video?
Larry Jordan: I did not, so tell me more.
James DeRuvo: Sort of. There’s a mode on this GH5 called “6K photo mode” and basically you use that to take 6K still images. But you can go into the settings and set the shutter for stop and start so that every time you press the button you start it, and then you press it again and you stop it. And what that basically turns it into, is a 6K video camera, shooting in H.265. The source footage will need to be converted to ProRes to be supported by most editing suites. Premiere does have a free plug-in that you can download that will read H.265 natively, but I’m sure you would agree with me, it would be a resource hog to play with it that way. So you have to jump through that hoop of converting into ProRes and the other drawback is that the 6K photo mode is in 4:3 so it’s not strictly 6K resolution. It’s more like 5K plus. But you could also upscale that to 8K.
Larry Jordan: What was the initial issue that this work around resolves?
James DeRuvo: When Panasonic came out with the GH5 and its 6K chip, many were left wondering why they couldn’t harness the entire chip to shoot video and this non-destructive hack will let them do it. But with the huge file sizes and additional steps in the workflow, you’re not going to want to do it all the time but to grab that beautiful sunset and get that extra bit of resolution and color gamut, it might be worth a try.
Larry Jordan: Alright, in the little bit of time we’ve got left, what’s our third story?
James DeRuvo: Finally Apple’s HomePod, the home assistant that’s coming out, developers are starting to dig into the firmware and they’re starting to find out details about the next generation iPhone. It looks like it’s going to have 4K in both the front and the back camera, as well as 60 frames per second. It’ll have a new bezel-less design that’ll likely have face detection to unlock. It’ll also have a glass back for wireless recharging and their tenth anniversary iPhone which I think might be called the iPhone Pro, will have an OLED screen.
Larry Jordan: What’s your sense for the next iPhone announcements?
James DeRuvo: I think we’re going to see some smart camera modes in the phone that’ll basically select the best scene detection for the lighting that it reads. There’s going to be something called freeze motion which will be able to fight motion blur, so you can get a better quality moving image. Based on the specs found on Apple’s new HomePod, it looks like Apple’s going to have a data upgrade for the iPhone 7S and 7S plus, and then they’ll add that tenth anniversary iPhone. But the real question is going to be, “Is it going to be worth the money?” Would you pay over $1,000 for an anniversary iPhone?
Larry Jordan: We’ll just have to see what it does before I’ll decide if it’s worth the money. James, what other stories are we following?
James DeRuvo: Other stories we’re following include a portion of Leica going on the auction block, and Zeiss looks to be the main suitor. That leaked spec from the HomePod also indicates that the Apple TV will finally be coming out in 4K, and here’s a question for you. What are the ten commandments for working on a film set? You’ve got that one as well.
Larry Jordan: There’s only one place to know, and that is what website?
James DeRuvo: All these stories, plus reviews and tutorials can be found at Doddlenews.com.
Larry Jordan: James DeRuvo, the senior writer for Doddlenews.com, returns every week with a DoddleNEWS update. James, as always, thank you very much, have yourself a good week.
James DeRuvo: You too.
Larry Jordan: Take care, bye bye.
James DeRuvo: Bye bye.
Larry Jordan: Here’s another website I want to introduce you to. Doddlenews.com. DoddleNEWS gives you a portal into the broadcast, video and film industries. It’s a leading online resource, presenting news, reviews and products for the film and video industry. DoddleNEWS also offers a resource guide and crew management platforms specifically designed for production. These digital call sheets, along with their app, directory and premium listings, provide in depth organizational tools for busy production professionals. DoddleNEWS is a part of the Thalo Arts Community, a worldwide community of artists, filmmakers and storytellers. From photography to filmmaking, performing arts to fine arts, and everything in between, Thalo is filled with resources you need to succeed. Whether you want the latest industry news, need to network with other creative professionals or require state of the art online tools to manage your next project, there’s only one place to go. Doddlenews.com.
Larry Jordan: Philip Hodgetts is a technologist and the CEO of Intelligent Assistance and Lumberjack System. Even better, he’s a regular here on The Buzz and I’m always delighted to say hello Philip, welcome back.
Philip Hodgetts: Hi Larry, thank you for inviting me.
Larry Jordan: It is always a pleasure because every time I have you on, I learn something. So that’s always a good thing. Philip tonight we’re looking at the impact of artificial intelligence on the media production industry, and to get us started, how would you define artificial intelligence, machine learning and neural networks?
Philip Hodgetts: It’s a very good question because these are three aspects that are kind of merged into one concept called artificial intelligence. Right at the top end, generalized artificial intelligence is the replacement human. Anything that you and I can do, it would be able to learn to do in the same way that we would be able to learn to do it. That’s kind of scary and fortunately a fair way off still because if machines become autonomous, then they may not find much use for the human infestation on the planet. So I’d rather leave that to the people who worry about that as their daily job is to be a worrier.
Philip Hodgetts: What I see as much more interesting is the augmented person, the augmented human. Steve Jobs talked about the bicycle of the mind, he wanted his computers to be a way of making a faster human, and he used the analogy that a cougar could outrun every human, but a human on a bike can often outrun a cougar. So bicycle of the mind is the machine learning component where we get tools that make it easier for us humans to do the work that only we can do. This is generally implemented with machine learning which is the sort of generic class over neural networks. Machine learning is what we do, the neural networks are kind of how we do it. So a machine can be taught, usually by training that machine and the machine consists of a bunch of neural networks inside in general broad terms. There are obviously variations on this, but let’s stick to keeping it understandable.
Philip Hodgetts: The machine is trained by showing it a lot of examples and telling it whether it’s getting closer to the output that you want or further away from the output that you want. Ultimately, the neural networks inside take this feedback and work out how to give you the result that you expect or want, even though we ultimately never know what’s going on in that neural network’s brain. Once you’ve generated a model like this, you can take that model and you could run that say on Apple’s ML kit, on an iOS device. So you could have the modeling of the way your customers might behave or the way your equipment might behave, and you can run it in real time on an iOS device.
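The feedback loop Philip describes can be illustrated with a minimal sketch. It assumes the simplest possible case: a single artificial neuron rather than a full neural network, trained on a toy data set (logical OR) that is not from the show. The names `train`, `predict` and `EXAMPLES` are made up for illustration, and real systems use far larger models and dedicated libraries.

```python
import math
import random


def train(examples, epochs=2000, lr=0.5, seed=0):
    """Fit one artificial neuron to (inputs, target) pairs by repeated feedback."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            z = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1.0 / (1.0 + math.exp(-z))  # squash to the range 0..1
            # Feedback: positive if the output was too low, negative if too high.
            error = target - output
            # Nudge each weight in the direction that shrinks the error.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(model, inputs):
    """Run the trained neuron on new inputs and return a score between 0 and 1."""
    weights, bias = model
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy training data (an assumption for this sketch): learn logical OR.
EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
model = train(EXAMPLES)
```

After training, `predict(model, [0, 0])` falls below 0.5 and the other three inputs score above it. The learned weights are just numbers, which is one way to read the remark that we never quite know what is going on inside the network.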
Philip Hodgetts: A lot of what’s been done with machine learning has been applied to very common tasks. There are application programming interfaces, things that a developer can simply call up on the internet and say “Here’s a file, give me back the text of that file.” There’s a number of people doing these, what I group under the heading of cognitive services. It’s a sub-set of machine learning because these are machines that have already been trained by somebody else. IBM Watson, for example, has trained machines using neural networks how to take speech and convert it into text, how to recognize images, how to detect emotion, how to identify the emotion, how to extract key words, concepts, identities. All of these things can be done right now and accessed by anyone from IBM Watson, from Microsoft Cortana, Google have a bunch of APIs as do Amazon.
Philip Hodgetts: And of course you can mix and match these APIs, but generally these will be called by a programmer. So as I said, the machines can be trained to do whatever we want them to learn to do, whether it just be learning to better predict what apps I might look at at six o’clock at night when I swipe to search on my iOS device, through to making a speech-to-text engine. So I think that’s the overview of what artificial intelligence encompasses right now.
Larry Jordan: Now the tech industry generally follows the philosophy of inventing whatever they can, and then worrying about the impact of that invention on society after the fact, and I call this the law of unintended consequences. What do you see as the fallout from implementing more and more tools based on AI?
Philip Hodgetts: Well, obviously they’ll have an impact on employment but I think Terry Curren is going to address that later. It would change the workflows that we have. Say for example, you’re working on a documentary or reality program. You’ve got a bunch of material that’s come in, currently it’ll be transcribed overnight by interns or via a service, but now that we have engines we can do this in less than real time, faster than real time, and that’s faster than the real time of the longest interview that you might have, or the longest take you might want to transcribe because it all happens in parallel. So, we have back almost instantly, transcription, key word extraction, and concept extraction. The difference between key words and concepts is key words are saying what you see, which are the most important words that are used in the transcript, hence key word. Whereas concept is, instead of speaking about BMW, Ford and Chrysler, we’re talking about motor cars. So a concept extraction engine would understand that these are all motor cars or automobiles, and would categorize them that way.
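The distinction between key words and concepts can be sketched in a few lines. This is only an illustration under loud assumptions: the `CONCEPTS` lookup table is hand-written and hypothetical, whereas a real concept extraction engine learns these relationships, and the stop word list is deliberately tiny.

```python
import re
from collections import Counter

# Small, hand-picked stop word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "with", "we", "of", "to", "in"}

# Hypothetical hand-written lookup. A trained engine would infer this mapping.
CONCEPTS = {"bmw": "automobile", "ford": "automobile", "chrysler": "automobile"}


def tokenize(transcript):
    """Lower-case the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def keywords(transcript, top_n=3):
    """Key words: the most frequent words that literally appear in the text."""
    counts = Counter(w for w in tokenize(transcript) if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def concepts(transcript):
    """Concepts: the broader categories that the literal words belong to."""
    return sorted({CONCEPTS[w] for w in tokenize(transcript) if w in CONCEPTS})


sample = "We compared the BMW with the Ford, and the Ford beat the Chrysler."
top_keyword = keywords(sample, top_n=1)
found_concepts = concepts(sample)
```

Here the top key word is a word the speaker actually said ("ford"), while the concept ("automobile") never appears in the sample sentence at all.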
Philip Hodgetts: Imagine how important it would be for reality TV if people could identify where in all of the 60 hours of material from today’s shoot are the high emotion points? And even nicer if we know whether they’re happy or sad, or angry. IBM Watson currently detects five different emotions. They have said that they’re hoping to work on more positive emotions over time. Happy is the only positive emotion they can currently identify but they’ve got four negative emotions there. Recognizing images, so we’ll be able to detect whether this person is the same person we’ve got in other shots, grouping them together, waiting for the moment for somebody to add a name to them. We’ll get concepts like there’s a dog or a Rottweiler in this shot, there are three people, a Rottweiler and they’re in front of a cathedral. That metadata will come back automatically from the image recognition engines. And this doesn’t require any programming or any machine learning concepts to be done by the actual developer. These are things that anybody can draw on right now, for very small amounts of money.
Larry Jordan: About ten years ago, you created the precursor to Lumberjack, a program called First Cut. What was your thinking in creating that program back then?
Philip Hodgetts: We were looking at an era long before any sort of machine learning or artificial intelligence was available, so we built what was known at the time as a knowledge system, where you modeled the knowledge of experts and build a system that replicates that knowledge in some software form. You build it into an algorithm. So for First Cuts, the algorithm was doing a very quick version of what the neural networks do, and I would make a cut, and we’d make some rules as to how to make a cut based on metadata, we’d run that algorithm. I would say what was wrong with it, then I’d have to translate what I would do instead into a rule of thumb that then could be implemented. So essentially it’s just a bunch of really good rules of thumb of how to build a story together is what we did in First Cuts.
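A rules-of-thumb system of the kind Philip describes can be sketched as follows. This is not the actual algorithm from his product. The metadata fields (`story_beat`, `rating`, `duration`), the beat order and the three rules are all hypothetical, chosen only to show how hand-written rules can turn clip metadata into an ordered cut.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    story_beat: str   # hypothetical logging field, e.g. "intro"
    rating: int       # hypothetical 1-5 rating given by the logger
    duration: float   # seconds


# Hypothetical story structure the rules try to follow.
BEAT_ORDER = ["intro", "conflict", "resolution"]


def first_cut(clips, max_seconds=60.0, min_rating=3):
    """Apply three hand-written rules of thumb to build a rough cut."""
    # Rule 1: discard poorly rated clips and clips with an unknown beat.
    usable = [c for c in clips
              if c.rating >= min_rating and c.story_beat in BEAT_ORDER]
    # Rule 2: follow the story structure, best-rated clips first in each beat.
    usable.sort(key=lambda c: (BEAT_ORDER.index(c.story_beat), -c.rating))
    # Rule 3: add clips in that order, skipping any that would overrun.
    timeline, total = [], 0.0
    for clip in usable:
        if total + clip.duration <= max_seconds:
            timeline.append(clip)
            total += clip.duration
    return timeline


clips = [
    Clip("c1", "resolution", 5, 20.0),
    Clip("c2", "intro", 2, 10.0),
    Clip("c3", "intro", 4, 15.0),
    Clip("c4", "conflict", 3, 30.0),
    Clip("c5", "conflict", 5, 25.0),
]
cut_names = [c.name for c in first_cut(clips)]
```

With this sample data the cut comes out as c3, c5, c1. Each change an editor wants has to be translated by hand into a new or adjusted rule, which is the slow loop Philip contrasts with letting a machine learn from examples.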
Philip Hodgetts: I don’t know whether machine learning would be a faster way to do that these days or whether the modeling of human knowledge makes it a little bit faster, because the business of editing is very complex. We can have AIs teach themselves certain things, like recently Google had an AI teach itself how to walk. But then there’s only really two rules for teaching that: move forward and don’t fall over. I don’t think I could distill down even the simplest wedding algorithm into terms like that, so that it could be challenged and developed autonomously. But if I could find enough examples of good weddings, and they were graded, say “This is a good example of a wedding video and this is a bad example of a wedding video,” I could run those through a machine, set the machine up and we would probably derive a machine that would then be able to run and build wedding like videos on the fly.
Larry Jordan: Philip, how fast do you expect AI to start materially affecting jobs?
Philip Hodgetts: I expect to see the first implementations available to everyone’s software this year. So probably because overall we’re a fairly conservative industry, it’s still going to be two or three years before we start to see any real effects on jobs. This will be more powerful for independent filmmakers like Cirina Catania who have good technical knowledge but struggle with organization and extracting a story. So, for people like that I think it will empower people very quickly. Moving into the studios, we’re probably looking at eight, nine or ten years before anything changes in those very regimented systems.
Larry Jordan: So what should we do to best prepare ourselves for the future?
Philip Hodgetts: Well if you want to stay in the media entertainment industry, learn how to tell stories regardless of the method that you use to tell the story. The core storytelling skills have been pretty constant from the cavemen around a camp fire to the present day feature film. We have to engage interest and provide a story that keeps people along with you. So I think preparing to be adaptive, preparing to learn new skills is also one of the most important features going forward. I like to call it constructive forgetting because you’re going to have to forget things that were absolutely true five years ago, because they’re no longer true as absolutes. We’ve both seen that so much in our careers so far, it’s obviously going to continue that way that the change will be there, but we need to be flexible and open to the change, embrace it and become the best bicycle of the mind rider that we can be.
Larry Jordan: I’m delighted to say the older I get, the easier it becomes to forget, so I think I’m perfectly positioned for this new technology.
Philip Hodgetts: I’m fine with that too.
Larry Jordan: Philip, for people that want more information about where and what you’re thinking, where do they go on the web?
Philip Hodgetts: The where I’m thinking is at philiphodgetts.com, Philip with one L. For our business applications, there is intelligentassistance.com and lumberjacksystem.com.
Larry Jordan: Philip Hodgetts is the CEO of both Intelligent Assistance, and Lumberjack System. Philip, thanks for joining us today.
Philip Hodgetts: My pleasure.
Larry Jordan: Take care, bye bye.
Larry Jordan: Terence Curren is the founder and president of Alpha Dogs, a Burbank based post production facility that he started back in 2002. Terry is also the host of the Editors Lounge, a regular gathering of post production professionals interested in improving their craft. Hello Terry, welcome back.
Terence Curren: Thanks Larry, I always enjoy these confrontations.
Larry Jordan: I always enjoy talking with you, because if there’s one thing I can count on, it’s the fact that you never ever have an opinion.
Terence Curren: I guess you know me well.
Larry Jordan: Terry, we’ve just discussed the basics of artificial intelligence with Philip Hodgetts, so what I’d like to do is to focus more on its impact in our industry with you. So, what are your thoughts? Is AI a good thing, or are we doomed?
Terence Curren: Wow, that’s a great question. Well let me put it this way, ultimately we are doomed I think. But right now, what’s really important is to focus on being creative. The reason I put it that way is that, yes, at some point in time, artificial intelligence will become as intelligent as human beings, and then one second later, it’s smarter than we are, and a day later we’re cockroaches compared to it. When that happens, which is hopefully a long way off, the rosy predictions are like 2040. The less rosy predictions are much later than that, so it’s a ways off, but there are stages, as Philip’s talked about of AI, and the ones that are going to immediately start replacing jobs in our industry are the mundane jobs, logging footage, syncing dailies, that kind of stuff, which is why I tell people, focus on the creative, because that’s the hardest thing for AI to do, the creative thing. That muse that strikes and gives you an idea of how to put two things together that you shouldn’t put together, that all the rules say don’t put together, but somehow it makes an amazing end result. So that part is going to be the last thing to be replaced so if you want to be in this industry and to continue to work, focus on the creative.
Larry Jordan: Now when you say focus on the creative, it sounds like what you want us to do is look more at the craft of editing as opposed to the technology?
Terence Curren: Exactly. Because from the technology standpoint, that’s going to be the easiest thing to replace, technologically so to speak. Philip ten years ago was showing his First Cuts which would string out footage into a basic rough cut and then the editor could go in and just fine tune it. That eliminates the mundane part which is what he was trying to do. But it also eliminates a job that an assistant editor would traditionally do. So if you’re doing anything that’s very repetitive, if it’s something that someone can be taught within a few days, you’re probably going to get replaced sooner than later. That’s why I really recommend focusing on the creative side because that’s the part that’s going to be the hardest thing to replace with artificial intelligence in the long run.
Larry Jordan: Putting aside the emotional aspect of people losing their jobs which is very similar to the old joke of “Other than that, Mrs Lincoln, did you enjoy the play?” Is it really a bad thing that we’re streamlining our workflow?
Terence Curren: No. I’m not a Luddite. I think that the technological improvements that we’re making everywhere, not just in our industry, but in general, make our lives better and have the promise of making our lives better throughout. The problem is that we have a society based on a minimum of a 40 hour work week work ethic and we’re moving to a society where there just won’t be that much work that needs to be done, and how do we restructure our society so that we can enjoy the benefits of all of this technology, and not feel guilty that we’re not working hard enough?
Larry Jordan: Or more importantly, how can we enjoy the benefits of all this technology, and live while making less money?
Terence Curren: Yes, which is the whole universal basic income discussion which is probably an entirely different show.
Larry Jordan: If you’re a young editor starting out, what advice would you give? Should they embrace this technology, should they fight against the technology, and how should they structure their career?
Terence Curren: Ooh, well, if somebody was starting out now, and wanting to be an editor, I would tell them the same thing that I was saying back when I was teaching editing classes around 2000. That is, if you can imagine yourself doing anything else for a living, you should go do it, because our industry is so difficult and so competitive to get in and then make a decent living at, that unless you can’t imagine doing anything else, you probably won’t have the drive for the long run that it takes to build a career. That said, if you are one of those people who can’t imagine doing anything else, then just do it. Edit as much as you can, edit for friends, look at your local film school and offer to edit director’s projects. Wherever you can edit, and that’s how you get the chops, and the connections that eventually will lead to a career.
Larry Jordan: Terry for people that want more information about what you’re doing and Alpha Dogs itself, where can they go on the web?
Terence Curren: Alphadogs.tv for Alpha Dogs, editorslounge.com for the Editors Lounge, and theterenceandphilipshow.com for your dose of Philip and I debating various things.
Larry Jordan: Bring extra coffee when that occurs. That website is alphadogs.tv and Terence Curren is the founder and president of Alpha Dogs. Terry, this has been fun, thanks so much for joining us.
Terence Curren: Thanks for having me Larry.
Larry Jordan: Josh Wiggins is the chief commercial officer for GrayMeta which is a company that leverages machine learning and artificial intelligence along with metadata and content workflows for the media and entertainment, law enforcement and healthcare industries. Hello Josh, welcome.
Josh Wiggins: Hello there.
Larry Jordan: How would you describe GrayMeta?
Josh Wiggins: GrayMeta is a relatively new company and as you said in your intro, we’re leveraging machine learning and artificial intelligence to help enterprises in different sectors connect to their content and data that they have, and really streamline and bring efficiencies to workflows which predominantly have been leveraging human capital to scale.
Larry Jordan: How do you differentiate between machine learning and artificial intelligence? What’s the difference?
Josh Wiggins: It’s a great question and I think if ten different people asked it and ten different people answered, you’d get a range of answers. My viewpoint here is that machine learning is great technology that we have that’s allowing us to replace the creation of metadata that was getting done by humans and loggers, and is getting done by machines in the area that I talk about with video content and images. So it’s training a machine to understand what is actually in the image, like what we have on our iPhone or on our new Samsung phone. Now it’s the artificial intelligence that’s using that metadata to be able to make a decision on behalf of something that a human would have done.
Josh Wiggins: So as an example, in media entertainment, compliance is an area where for years you’ve had humans have to look at a video or a movie to understand if certain things are good or bad or allowed for different markets. Now what we can do is use machine learning to actually identify if there’s swearing or nudity or certain religious flags in the content and then make a decision that it should be flagged and sent to the next step in the workflow without necessarily having to have a human look at every single frame. So that’s how machine learning is creating metadata, and my view is that artificial intelligence is using that metadata to make a decision that a human would have previously made. Or, in some cases, AI is there to complement the human because machine learning isn’t 100 percent perfect at the moment.
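Josh’s split between the two steps can be sketched in code: machine learning produces metadata (detections with a confidence score), and a decision step acts on it. Everything named here is an assumption for illustration, including the `Detection` fields, the `MARKET_RULES` table and the two thresholds. It is not GrayMeta’s implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # what the model thinks it found, e.g. "profanity"
    start: float       # seconds into the program
    end: float
    confidence: float  # model score between 0 and 1


# Hypothetical per-market restrictions.
MARKET_RULES = {
    "airline": {"profanity", "nudity"},
    "late_night": set(),
}


def review(detections, market, auto_threshold=0.9, human_threshold=0.5):
    """Decide what to flag automatically and what to send to a human."""
    restricted = MARKET_RULES[market]
    flagged, needs_human = [], []
    for d in detections:
        if d.label not in restricted:
            continue  # allowed in this market, nothing to do
        if d.confidence >= auto_threshold:
            flagged.append(d)      # confident enough to act without review
        elif d.confidence >= human_threshold:
            needs_human.append(d)  # uncertain, so a person verifies it
        # Anything below human_threshold is treated as noise in this sketch.
    return flagged, needs_human


detections = [
    Detection("profanity", 12.0, 13.5, 0.97),
    Detection("nudity", 300.0, 310.0, 0.62),
    Detection("profanity", 400.0, 401.0, 0.20),
    Detection("logo", 50.0, 55.0, 0.99),
]
airline_flagged, airline_review = review(detections, "airline")
late_flagged, late_review = review(detections, "late_night")
```

The middle band between the two thresholds is where the human complements the machine, reflecting the point that machine learning is not 100 percent perfect.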
Larry Jordan: So machine learning is the process, and artificial intelligence is the result?
Josh Wiggins: You could go there. Many would disagree, but I believe that’s a good summary based on our point of view at GrayMeta.
Larry Jordan: Alright. Within the point of view of your company, where do you see the biggest opportunity for AI? Is it in production, or post or distribution?
Josh Wiggins: I think it’s in the middle, out to distribution. There’s a lot of areas where it can help in production. We’ve seen a lot of people whose job is around dailies get really interested in this. We’re a little bit cautious, because it opens up a lot of different security constraints that you have to deal with with pre-released content. But applying machine learning to the dailies process, and we’re starting to see camera manufacturers embrace it, and people that are creating tools for the dailies space, with facial recognition to identify who the actors are, I think that’s going to start to bring efficiencies. I think you’re going to run into some of the unions and what someone should do, and the profiles of roles on the production side, but more so definitely in distribution. There’s a huge opportunity here to streamline and cut some of the costs out of what’s been going on in distribution and allowing people to get content to markets a little bit quicker, and really improve search and recommendation of content which I think definitely needs some work.
Larry Jordan: Well technology companies tend to view AI as an enabler, while many editors are concerned that it’s going to cost them their job. What’s your perspective?
Josh Wiggins: That’s a good question.
Larry Jordan: They’re easy to ask.
Josh Wiggins: Yes, I don’t spend the majority of my time in the post and the editorial side, but the good news is we’re able to work with partners that are in that. I think it’s going to have an impact in the editorial process. We’re already seeing that, whether it’s redacting a face or editing out a logo of a brand, that’s something that’s done today where someone books an edit room and they’re going to find all of these brands that might be on a soda bottle in a UK show and you need to edit it out for a different version in the US. I do believe we are very close to not necessarily needing a human as part of that process, or maybe only to verify. And that will have an impact. You can detect a logo or a brand, or an image and you can use that data now to feed an EDL and actually go directly through and automate a redaction or a blur. We’re actively working on that with a number of our partners that are in the editing space or in the media asset management. So that’s coming, and it is going to have an impact on what people do.
Larry Jordan: A lot of what your software can do is to scan huge amounts of footage in a very short amount of time, so rather than having a person sit there and watch these thousands of hours of media, you can do that using your software. Is that correct?
Josh Wiggins: Yes, that’s a great summary. What we’re able to do is plug into the best of the best of the machine learning and AI platforms. Our platform and our company really aggregates together the right services and actually makes them usable by the customer. So we work with a lot of the large providers out there like Amazon, IBM, Google and Microsoft, but then we’re also working with some of the smaller niche providers, and I think that’s what really helps us get the right metadata service for that particular client. On your point there about being able to analyze all of the data, we work in law enforcement and we’ve started to look at the problem there with body cams, and what’s interesting is there just isn’t enough human horsepower to review all of the footage. The same could be said if someone just acquired a whole library and needs to review it. So what we’re doing is using this metadata to say, “Well OK, of the 7,000 hours of content, these 200 hours have profanity and swearing in them.” Now you’ve actually got something to act on. The key is taking action, and that’s what we’re really trying to do in our platform, allowing people to take action quicker, and reduce the amount of money they have to spend in doing so.
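The idea of narrowing a large library down to the hours that need attention can be illustrated with a short Python sketch. It assumes, hypothetically, that an analysis service has already attached a set of tags and a duration to each asset; the `flag_for_review` function, the field names and the sample figures are invented for illustration and are not taken from any real platform.

```python
def flag_for_review(assets, tag):
    """Return (flagged assets, flagged hours, total hours) for one metadata tag.

    `assets` is a list of dicts with hypothetical keys:
    "title" (str), "hours" (float) and "tags" (set of str).
    """
    flagged = [a for a in assets if tag in a["tags"]]
    flagged_hours = sum(a["hours"] for a in flagged)
    total_hours = sum(a["hours"] for a in assets)
    return flagged, flagged_hours, total_hours


if __name__ == "__main__":
    # Invented sample library; real figures would come from the analysis step.
    library = [
        {"title": "Episode 1", "hours": 2.0, "tags": {"profanity"}},
        {"title": "Episode 2", "hours": 1.5, "tags": set()},
        {"title": "Episode 3", "hours": 0.5, "tags": {"profanity", "violence"}},
    ]
    flagged, flagged_hours, total_hours = flag_for_review(library, "profanity")
    print(f"Review {flagged_hours} of {total_hours} hours:")
    for asset in flagged:
        print(" -", asset["title"])
```

The point of the sketch is the ratio: a reviewer only opens the flagged subset instead of watching the whole library.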
Larry Jordan: We’ve talked about the fact that AI can be applied to production or post or distribution, but where do you see AI having the biggest impact, and what do we need to change to make that even more effective?
Josh Wiggins: There’s an area where I think this can be used, and it’s in the rights, in the actual buying and selling of content. I think this is an area which always gets forgotten. When a studio sells a particular piece of content to a broadcaster, in that deal there are pieces that go along with it, the video, the marketing collateral, the billboards and the stills. But I think what’s missing is that this metadata that’s created by machine learning should go all the way through the process. So when a studio is creating metadata about compliance for an in-flight review of that piece of content, if there’s a broadcaster in Spain that’s buying it, they should be able to get that as well. It shouldn’t just be the movies. So I think the machine learning and AI power needs to carry all the way through to the person that buys the content. And that’s got to happen at the deal level, at places like Cannes and MIPCOM where people are buying and selling content. I don’t believe that people are talking about AI and machine learning in those conversations.
Larry Jordan: For people that want more information about what GrayMeta is doing, where can they go on the web?
Josh Wiggins: They can go to our website which is www.graymeta.com.
Larry Jordan: That’s all one word, graymeta.com and Josh Wiggins is the chief commercial officer for GrayMeta, and Josh, thanks for joining us today.
Josh Wiggins: Great, thank you very much Larry.
Larry Jordan: Jim Tierney founded Digital Anarchy in 2001 specifically to develop plug-ins to simplify creating visual effects. And this week he’s in beta with a new one. Hello Jim.
Jim Tierney: Hey Larry, how’s it going?
Larry Jordan: Well, it’s going great because tonight we’re looking at the impact of AI on the video industry and you’re currently in beta with Transcriptive, which is directly relevant to this discussion. Tell us about the product.
Jim Tierney: We’re using AI machine learning to transcribe video, doing it automatically. It’s a plug-in for Premiere Pro so everything’s totally integrated into Premiere and so we’re exporting the audio out of Premiere, uploading it to the speech services, and getting back a transcript that then appears within Premiere.
Larry Jordan: Where does AI get involved?
Jim Tierney: Well, we allow the user to use one of two speech services. Watson, which is IBM’s offering, and another one called Speechmatics, which is the better of the two, to analyze the audio and transcribe it. So the AI is in the speech services.
Larry Jordan: Recently I reviewed another automated transcription service and discovered that it had problems with technical jargon and proper nouns. What factors reduce the accuracy of these automated transcripts?
Jim Tierney: You know, exactly that. The quality of the audio matters, a lot. It can handle accents, but it helps if the person is well spoken. That’s a big deal. Certainly technical jargon is going to be a little bit problematic, uncommon names are going to be a little bit problematic. So there’s definitely accuracy issues with some areas which is why it’s still going to require a little bit of clean up for sure.
Larry Jordan: With AI does the system get better the more it listens or is AI just the process of being able to create the text in the first place?
Jim Tierney: That’s why we have the speech services as the back end because they’re the ones doing the training, and that’s their specialty. Our specialty is creating tools for video editors. So, they’re focused on doing all the training, and the more stuff that gets thrown at them, the more it learns. But that’s more on the speech services side.
Larry Jordan: Now back to Transcriptive. What languages does it support?
Jim Tierney: A lot.
Larry Jordan: More than US English?
Jim Tierney: Yes. It’s what the speech services are supporting, so it does a great job with English, Japanese, supports all the European languages and so there’s quite a few and there’s some other ones out there as well. I don’t know the entire list but it’s pretty comprehensive actually.
Larry Jordan: What does it cost?
Jim Tierney: The list price is 249.
Larry Jordan: I’m sorry, say again.
Jim Tierney: 249.
Larry Jordan: 249. Now you’re in beta now, have you set a release date?
Jim Tierney: We’re shooting for probably the next two or three weeks, we’re pretty close to release.
Larry Jordan: I’ll keep my fingers crossed for you, that’s always a nerve wracking time, just before you release a product.
Jim Tierney: Absolutely. There is a charge for the speech services that runs like two cents a minute or something like that. But compared to traditional transcription that’s pretty inexpensive.
Larry Jordan: For people that have done traditional transcription, there’s the person that transcribes it, and then generally an editorial person or two that looks over the quality of the transcript. So generally the transcript we get back traditionally has a higher quality than this. What’s the biggest advantage to using automated transcription?
Jim Tierney: It’s a lot cheaper. I think the quality of Speechmatics is pretty close to what you’re going to get from a lower-cost transcription service, at least with good audio. So there’s that. You have it all within Premiere so the text is searchable, which means you can jump to where that text happens on the timeline. And we just implemented a feature where, if you have regular transcripts from a normal source, you can pull them into Premiere and Transcriptive will conform them to the audio in your timeline. So it’ll sync everything up between the text and the audio.
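Searchable text that jumps to a point on the timeline depends on each transcribed word carrying a time offset. The Python sketch below shows that idea only; it is not Transcriptive’s code or API. It assumes a hypothetical transcript format of `(word, start_seconds)` pairs, and `find_phrase` is an illustrative helper name.

```python
def find_phrase(words, phrase):
    """Return the start time of every occurrence of `phrase` in a transcript.

    `words` is a list of (text, start_seconds) pairs, an assumed format.
    Matching ignores case and trailing punctuation on each word.
    """
    target = phrase.lower().split()
    if not target:
        return []
    tokens = [text.lower().strip(".,?!") for text, _ in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append(words[i][1])  # time of the phrase's first word
    return hits


if __name__ == "__main__":
    transcript = [
        ("We", 0.0), ("need", 0.4), ("the", 0.7), ("budget", 0.9),
        ("today.", 1.4), ("The", 5.0), ("budget", 5.3), ("is", 5.8),
        ("final", 6.0),
    ]
    for seconds in find_phrase(transcript, "the budget"):
        print(f"Jump playhead to {seconds:.1f}s")
```

An editing tool would then move the playhead to each returned offset; conforming an existing transcript to audio is a harder alignment problem that this sketch does not attempt.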
Larry Jordan: Well put your futurist hat on, this is the last question I’ve got for you. What do you see as the future of AI in our industry, and what would you suggest to editors who are concerned about it taking jobs?
Jim Tierney: I mean I think it’s a little bit of a blow, and I think on the lower end you probably will see it taking some jobs, if you’re doing wedding videos or music videos or stuff like that, stuff where there is definitely enough training data for the AIs to do that. You know, there might be some instances where there’s a $50 web service if you want to give all your guests iPhones and let them shoot the video of your wedding and then let the AI deal with that mess. That’s going to be available I think fairly soon. But so much of video editing is managing the client, and client relationships, and then being able to interpret what the client says and figure out what they really want. And I think AI’s a very long way from dealing with that type of thing.
Larry Jordan: Jim, for people that want more information about Digital Anarchy and Transcriptive, where do they go on the web?
Jim Tierney: They go to digitalanarchy.com and there’s a button on the front page that allows them to sign up for the beta.
Larry Jordan: That’s all one word, digitalanarchy.com and Jim Tierney is the president and founder of Digital Anarchy. Jim, thanks for joining us today.
Jim Tierney: Thanks Larry.
Larry Jordan: Take care bye bye.
Larry Jordan: I want to introduce you to a new website, Thalo.com. Thalo is an artist community and networking site for creative people to connect, be inspired and showcase their creativity. Thalo.com features content from around the world with a global perspective on all things creative. Thalo is the place for creative folks to learn, collaborate, market and sell their works. Thalo is a part of Thalo Arts, a worldwide community of artists, filmmakers and storytellers. From photography to filmmaking, performing arts to fine arts, and everything in between, Thalo is filled with the resources you need to succeed. Visit Thalo.com and discover how their community can help you connect, learn and succeed. That’s Thalo.com.
Larry Jordan: Laurent Martin trained as an opera singer in Los Angeles then sang professionally in Germany. However, in 2012 he cofounded Aitokaiku which is a mobile application that creates original music using artificial intelligence. Hello Laurent, welcome.
Laurent Martin: Hi Larry, thanks for having me.
Larry Jordan: It’s good to have you back, because the last time we chatted was earlier this year, in January but in tonight’s show, we’re looking at different ways that our industry is either using or being affected by AI. So just to get us started, let’s talk about your company Aitokaiku. What is it?
Laurent Martin: We are a music technology company and the technology that we make creates music from your world. It transforms your world into music and the way we do that is we take sensor data and really it could be any data stream, and we compose live music for that. So our mission in life, is to create music that is personal and to really personalize every note of music that you hear. It should always be you that is in the music, your life, your activities.
Larry Jordan: Where does AI fit into this?
Laurent Martin: The first thing I want to say about AI and I don’t want to be too provocative, but we think at Aitokaiku that music AI is very boring. The reason that we think it’s boring is the same way that flour is boring. But it’s also essential, and you can’t make bread, you can’t make cake, unless you’ve got the flour, and so it’s sort of the same way with AI. Our music composition engine is something we test a lot of different technologies with, not just AI, also algorithmic composition, probabilistic composition, so there’s a lot of different things that go into that and that’s changing and evolving all the time. For us though, just the raw composition aspect of it has to be a given in the same way that for any type of musician, proficiency is the minimum standard.
Larry Jordan: It sounds to me like what you’re doing is you’re taking the sounds of the environment, and turning it into something musical. In order for us to recognize it as music, we have to have some sort of composition underlying it, and the composition engine is what you’re using the AI for. Is that a true summary?
Laurent Martin: That’s a great way to summarize it, and you know, all the things that we can do in the real world with sensors, we can also virtualize, so we also have an app called Vimu, video music, for iOS, and that is able to create and compose music based on the action inside the shot of the video. So no matter what it is that we’re using as that input, it’s that thing that gives it personality, that gives it a human connection, a connection with somebody’s actual experiences in the real world, whether it’s in that virtual space as a video or whether it’s in somebody’s real life, which we do with microphones and movement sensors and things like that. We think that the interesting point happens when somebody creates that music themselves and with their life, and it’s not just a black box.
Larry Jordan: So what you’re doing is you’re creating music which is tailored to whatever’s going on at that instant that’s being picked up say by the microphone or the cell phone, and turning that into music? What is the role that AI plays in that?
Laurent Martin: There’s a couple of different roles that it plays. There is the intelligence that goes into the composition. There’s also a significant amount of data that we are processing around the user behavior, and that is just as important. It’s about what our machine can learn from someone’s actual environment, taking those data streams and understanding the signal amongst the noise, so to speak, and then processing it on the other side as part of the composition engine. We use different ways to do that. We’re still experimenting and finding what makes the best music for our particular platform, and this is something that is changing so fast as well, something that we’re not only invested in in terms of our own teams and our own product.
Laurent Martin: We also hosted this last month the Music Information Retrieval meet up in Berlin and what that is is a monthly meet up for professionals in music information retrieval, but these are data scientists that work in the field of music technology. This is so cutting edge and it moves so fast that it’s only by having this regular sort of contact with professionals all throughout the field that we can really stay in front of it, and I think that something I want to bring up about AI is that it’s not this static thing. The pace of innovation on that is so much faster than in many other fields of music technology, that it’s something everybody’s got to watch out for. But I also think that there needs to be a bigger horizon as well.
Larry Jordan: You’re a musician, you’ve been classically trained and performed as a musician, and yet all of the composition that has been done by humans over centuries, is now being done by machines. Should we worry about that?
Laurent Martin: I have a very different perspective I think on this than a lot of other classical musicians. When I went to university I started with my bachelor’s degree at the University of California, Santa Cruz. David Cope is a professor there. He did literally write the book on algorithmic music composition and was one of the earliest and most prolific innovators in computer composed music. So from the age of 17 it was not unusual to see people performing music that was created by a computer in the concert hall. There were certainly a lot of friends and colleagues that were all making music that way and we were performing their music created by computers. So I don’t maybe experience the same separation that a lot of people have with that. A lot of different composers in the past have created their own style, their own individual algorithms which they wrote music with, and AI is just another tool that helps humans do that even faster and better, and have more creativity than they otherwise could. So for me, I don’t think I experience that like a lot of other classical musicians.
Larry Jordan: You mentioned the Music Information Retrieval group. Tell me a little bit more. Why was that started, and what benefit does it provide?
Laurent Martin: The Music Information Retrieval group is a monthly meet up of academics, music data scientists, music informatics researchers and people throughout the music tech industry here in Berlin. We present every week on different technologies, starting from how you assemble the data into manageable data sets to start using and understanding music. It could be audio signal processing, it could be manipulating symbolic music, so notational music, and it goes into composition as well. So it’s really a lot of diverse research and innovation that’s going on, all at the cutting edge, and Berlin is a really great place for that. We also are a founding member of Music Tech Germany, which is the world’s first trade group for music tech. That’s the thing about Berlin, we have SoundCloud, Ableton, Native Instruments, some of the biggest companies in music tech are all based here in Berlin, and so it’s really great not only to have meet ups like this Music Information Retrieval group where we can share ideas, but also a trade group where our business interests are represented and where there is a sense of community in trying to push the entire industry forward.
Laurent Martin: There’s a lot of camaraderie, there’s a lot of sharing ideas, and it’s something that as much as it’s important to go into our office and try to push ourselves, we’ve also got to go out there and look out for our peers in our industry, and make sure that we are on the cutting edge every single day.
Larry Jordan: Is there a website for the Music Information Retrieval group?
Laurent Martin: You can find it on meetup.com, probably by looking for Berlin Music Information Retrieval Group, and you can find the Music Tech Germany trade group at musictech.de.
Larry Jordan: That’s musictech.de?
Laurent Martin: Yes.
Larry Jordan: One other thing, getting back to you and your company, when we spoke in January, you just had an Android app. You’ve mentioned an iOS app. What do the two apps do?
Laurent Martin: Our Android app that is currently out is a prototype for making music from sensors from someone’s world. We tried to virtualize that experience within video, and so we created Vimu, that’s video music, and that creates an instant music video as you shoot. The music that’s created is actually reactive to the colors and the motion inside the shot. So it’s all real time, there’s no post production, and everything syncs with the action in the shot.
Larry Jordan: Is this calculation done on the phone itself? Or are you beaming data back to some super computer and doing processing at the back end?
Laurent Martin: We are very fortunate to have a very talented CTO, Matti Sarja, who is able to make all of those things happen on the device. This for us is really important, that people feel like this is something that belongs to them. It’s created on their device. Then we have coming up in a couple of weeks Somu, social music. That is a networked app and that will be for Android, and it’s going to be our flagship mobile app. The idea there is that the music that you’re creating from the sensors that are taking in your world and activities, you can share with another person, and when they’re listening to it, the sensors on their device are mixing their world into the music, so it’s dynamic, co-created music. It’s all live, and this is something that gives people that musical sense of jamming and connection. So that is obviously not all happening locally on the device, but when it broadcasts somewhere else, it means you’re connected to another individual, and that is really important for us.
Larry Jordan: Where can we go on the web to learn more about these applications?
Laurent Martin: You can check out aitokaiku.com and you can also find us on Facebook and Twitter.
Larry Jordan: Laurent Martin is the co-founder and chief marketing officer for Aitokaiku, and Laurent, thanks for joining us today.
Laurent Martin: Thanks for having me Larry, bye bye.
Larry Jordan: AI will not leave our industry unchanged, and as we heard tonight, a lot of jobs are going to evolve or slowly disappear over time. I think the key is to focus on what we do better than machines, which is to be creative. Continue developing our storytelling skills and make sure our businesses are running as efficiently as possible. It won’t do anyone any good to hide from AI, just the opposite. The more we learn about what it can do, the more we’ll understand what it can’t. And filling that creative gap is where the exciting work will be in the future.
Larry Jordan: I want to thank this week’s guests, Philip Hodgetts from Intelligent Assistance, Terry Curren from Alpha Dogs, Josh Wiggins from GrayMeta, Jim Tierney from Digital Anarchy, Laurent Martin, co-founder of Aitokaiku and James DeRuvo from DoddleNEWS.
Larry Jordan: There’s a lot of history in our industry and it’s all posted to our website, at digitalproductionbuzz.com. Here you’ll find thousands of interviews, all online and all available to you today. Remember to sign up for our free weekly show newsletter that comes out every Saturday.
Larry Jordan: Talk with us on Twitter @DPBuzz and Facebook at digitalproductionbuzz.com.
Larry Jordan: Our theme music is composed by Nathan Dugi-Turner with additional music provided by Smartsound.com. Text transcripts are provided by Take1 Transcription. Visit Take1.tv to learn how they can help you.
Larry Jordan: Our producer is Debbie Price, my name is Larry Jordan, and thanks for listening to The Digital Production Buzz.
Larry Jordan: The Digital Production Buzz is copyright 2017 by Thalo LLC.