SE Radio 695: Dave Thomas on Building eBooks Infrastructure
Dave Thomas, author of The Pragmatic Programmer, The Manifesto for Agile Software Development, Programming Ruby, Agile Web Development with Rails, Programming Elixir, Simplicity, and co-founder of the Pragmatic Bookshelf, speaks with SE Radio host Gavin Henry about building infrastructure for eBooks. They discuss what an eBook is, the various formats, what infrastructure is needed to build them, how an author writes a book, the history of the Pragmatic Bookshelf, how they have evolved, how to handle links within eBooks, why humans are so important in the writing process, and why AI can help with your writing, once you've written your content. Thomas discusses PDFs, eBooks, Mobi files, EPUB files, CI/CD pipelines, WYSIWYG, Markdown files, Pragmatic Markup Language, embedding code, AI agents, images, printing PDFs, JVMs, Java, JRuby, and how Markdown won the plain text writing format wars.
Brought to you by IEEE Computer Society and IEEE Software magazine.
Transcript
Transcript brought to you by IEEE Software magazine.
This transcript was automatically generated. To suggest improvements in the text, please contact [email protected] and include the episode number and URL.
Gavin Henry 00:00:18 Welcome to Software Engineering Radio. I’m your host Gavin Henry. And today my guest is Dave Thomas. Dave Thomas has been consulting and programming for 50 years and still writes code almost every day. He’s an author of The Pragmatic Programmer, one of my personal favorites, especially the 20th edition, The Manifesto for Agile Software Development, Programming Ruby, Agile Web Development with Rails, Programming Elixir, and I’m not finished yet. More recently, Simplicity. He currently runs the Pragmatic Bookshelf, and in his spare time he is looking at better ways of writing OO code. OO, what does that stand for?
Dave Thomas 00:00:56 Object-oriented of course.
Gavin Henry 00:00:58 Dave, welcome to Software Engineering Radio.
Dave Thomas 00:01:01 Well, thank you. I really appreciate being here. It’s great to chat.
Gavin Henry 00:01:04 Is there anything I missed in that impressive bio of yours or are you happy with it?
Dave Thomas 00:01:08 Well, not really. I mean, I do a lot of random stuff. Basically my wife accuses me of just doing the stuff I’m interested in, which is probably true. I’m a bit of a magpie when it comes to the things I’m interested in, so I’m never quite sure what I’m going to be attacking on a particular day or month.
Gavin Henry 00:01:25 That’s a great way to live. I like it. So before we start, I’ll mention a small disclaimer. I’m pretty close to finishing a book I’m writing for your company, the Pragmatic Bookshelf, about Rust and C. So I actually know how some of this stuff works from the author’s perspective, so I’d just like to say that.
Dave Thomas 00:01:42 And what are you doing talking to me and not finishing the book off?
Gavin Henry 00:01:45 Well, probably commitments. I was doing it this morning, don’t worry.
Dave Thomas 00:01:49 Oh good, good.
Gavin Henry 00:01:50 Anyway, let’s begin. So the show is called Building eBooks Infrastructure. So we need to lay the foundation. So I’d like to start with an overview of what an eBook is.
Dave Thomas 00:02:02 So an eBook is fundamentally just the text of a book in some kind of accessible format to electronics. In the old days, which was 15, 20 years ago, there were probably half a dozen different standards for what made an eBook. But nowadays we’ve pretty much settled on a format called EPUB. And EPUB is actually — I mean, if you have an EPUB file, you can actually just run unzip on it and if you unzip an EPUB, you’ll find that it’s a bunch of HTML, a bunch of assets and then a couple of manifest-type files that give it navigation and metadata and all that kind of stuff. So whenever you feed an eBook into your reader, whether that’s a Kindle or an Apple Books or whatever it might be, all it’s doing really is unzipping that and then running some kind of browser clone to actually display the content.
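To make the "just unzip it" point concrete, here is a minimal Python sketch. It is not the Pragmatic Bookshelf's tooling. It builds a tiny EPUB-shaped archive in memory, lists its entries, and reads the container file to locate the package (manifest) document. The paths OEBPS/content.opf and OEBPS/chapter1.xhtml, the placeholder file contents, and the helper names are assumptions made for illustration; a real EPUB needs a complete package document and navigation file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Hypothetical, heavily simplified contents for illustration only.
CONTAINER_XML = f"""<?xml version="1.0"?>
<container version="1.0" xmlns="{CONTAINER_NS}">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_sample_epub() -> bytes:
    """Return the bytes of a tiny EPUB-shaped zip archive (not a valid book)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        # Placeholder package document and chapter (assumed names).
        zf.writestr("OEBPS/content.opf", "<package/>",
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/chapter1.xhtml", "<html><body>Hi</body></html>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_epub_entries(data: bytes) -> list[str]:
    """List the file names inside the archive, in the order they were added."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def find_package_path(data: bytes) -> str:
    """Read container.xml and return the path of the package document."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find(".//c:rootfile", {"c": CONTAINER_NS})
    if rootfile is None:
        raise ValueError("container.xml has no rootfile element")
    return rootfile.get("full-path")


if __name__ == "__main__":
    epub = build_sample_epub()
    for name in list_epub_entries(epub):
        print(name)
    print("package document:", find_package_path(epub))
```

Pointing zipfile.ZipFile at a real .epub file on disk and calling namelist() should show the same general layout: a mimetype entry, a META-INF directory, and the HTML and asset files.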
Gavin Henry 00:03:00 You mentioned there used to be other formats. I remember always having to send a Mobi file. I think you can still download those when you buy a book from you.
Dave Thomas 00:03:09 No, Amazon have stopped distributing Mobis, I believe. I may be wrong about that. In the old days, you’re right, we had to give Amazon a Mobi file and that was always a real pain because Mobis differed from regular EPUBs in terms of the stuff that they would support. So getting compatible files between Mobis and EPUBs was always a challenge. But eventually Amazon gave in and switched across. So now we submit EPUBs to them and internally they do change them into something for the Kindle. I don’t know exactly how they do that, for all I know it may well be Mobi still, but yeah, we never now see a Mobi file.
Gavin Henry 00:03:53 And was that mainly because EPUB was a standard or is a standard and Mobi wasn’t?
Dave Thomas 00:03:58 Yeah, I think they were constantly playing catch up with the EPUB and frankly it just wasn’t worth it because I think initially, they went Mobi because it gave them the ability to control stuff a bit better on the Kindle. But eventually EPUB does everything Mobi does plus a lot more. It’s a way more flexible format. And so I think they just said, well, why are we fighting a technology war that we don’t really get any benefit from? So they just switched across.
Gavin Henry 00:04:28 Did they give you any help to render and produce them?
Dave Thomas 00:04:31 We didn’t have to do anything. We were always generating Mobis and EPUBs as part of our process and so we just basically stopped sending them Mobis.
Gavin Henry 00:04:41 I see. And was there a big difference? It seems that they were playing catch up, you said.
[...]