How to Become an AI Data Engineer
Do you want to be a data engineer? In this episode, host Alba Joven speaks with Oracle Autonomous Database specialist Javier de la Torre Medina about what it takes to become an AI data engineer.
Episode Transcript:
00:00:00:00 - 00:00:34:15
Unknown
Welcome to the Oracle Academy Tech Chat. This podcast provides educators and students with in-depth discussions with thought leaders around computer science, cloud technologies, and software design to help students on their journey to becoming industry-ready technology leaders of the future. Let's get started. Welcome to Oracle Academy Tech Chat, where we discuss how Oracle Academy prepares the next-generation workforce.
00:00:34:17 - 00:00:57:14
Unknown
I'm your host, Alba Joven. And in this episode, I'm joined by Javier de la Torre Medina, an Autonomous Database specialist at Oracle. Today, Javier and I will be talking about how to become an AI data engineer. Welcome, Javier. Thank you very much, Alba. It's a pleasure to be here today for this nice chat. Javier, before we dive in,
00:00:57:15 - 00:01:20:15
Unknown
can you tell us a little bit about your background and your role at Oracle? Sure. I joined Oracle 13 years ago, and I have worked in a lot of positions, all of them data related. I have worked as a big data specialist, a NoSQL specialist, and an Oracle Database specialist.
00:01:20:17 - 00:01:44:11
Unknown
So I have always worked to help customers build data architectures and solutions with the best approach, to help them get the best value from their data. On a normal day, I do workshops to show the technology, demos to show it live, and architecture diagrams. I also do a lot of proofs of concept.
00:01:44:13 - 00:02:07:17
Unknown
That is to help them build the solution that they're looking for. So in the end, I have always been data related, but now I have a special focus on Autonomous Database, which is our product in the market right now. I understand that you are an expert in Autonomous Database. For those who may not be familiar, what exactly is it, and why is it such a game changer?
00:02:07:19 - 00:02:35:15
Unknown
There is a market perception that Oracle is difficult, and that you need a lot of years of expertise in order to start using this technology. And this is our game changer, because we provide the Oracle Database as SaaS, like an ERP or a CRM, but from the technology point of view. It's SaaS because the goal of Oracle is that you can start working directly with the data and forget about everything else.
00:02:35:17 - 00:02:58:05
Unknown
So Oracle is automatically in charge of patching the database, doing backups, and all these kinds of activities. And that's why it's a game changer. Also, one of the good things is that a lot of the configuration is already done. So the idea is that you don't need to be an expert in Oracle or in [unclear] in order to start using it.
00:02:58:07 - 00:03:18:07
Unknown
Even if you leave university tomorrow, you can start using it and create great applications, including AI applications, which we'll talk about later, in a matter of minutes. And this is where the business sees value, because in the AI space we see that every day something new is changing or something is happening.
00:03:18:09 - 00:03:40:11
Unknown
So being up to date, or being able to adapt these new changes into your application or the business, is very important. That is the key difference, the game changer, for Autonomous Database. Also, one of the key things is that we're going to talk about this concept of the data engineer.
00:03:40:13 - 00:04:00:16
Unknown
Normally, when we talk about the Oracle Database, we are talking about database administrators. But this concept, I think, is going to disappear in favor of data engineers. The good thing, if someone is a DBA and is listening to us, is that the knowledge you have is still valuable, because you need to work with the data and you need to create it.
00:04:00:18 - 00:04:23:09
Unknown
You need to move the data and know the form of the data. So you can still use this value as a data engineer. But you can forget about all these tedious tasks, such as patching, backups, upgrades, and all the things that don't provide value and are normally a blocker to innovation. And the DBAs, or now the data engineers, can benefit from all these new capabilities
00:04:23:09 - 00:04:42:22
Unknown
that Autonomous Database is going to provide. You have mentioned the role of AI data engineer quite a bit. Can you explain how Oracle has made this role possible? So before going into the AI data engineer, let me explain what a data engineer is, for someone who is coming from a DBA background or has seen Oracle only from a database point of view.
00:04:43:00 - 00:05:03:03
Unknown
As I was mentioning, one of the goals of Oracle with Autonomous Database is to focus only on the data, which is really important. But there is also a concept in Autonomous Database that we call the converged data model. That means that we can work with any data type. We can work with JSON. We can work with spatial data.
00:05:03:09 - 00:05:26:23
Unknown
We can work with relational data. We can work with graph data. It doesn't matter. This is very important because many companies want to be data-driven companies, but they become data-movement companies: they move data from one product to another product. And then the business comes and says, "I need a report, and I need it by yesterday."
00:05:26:23 - 00:05:52:01
Unknown
By then, it is too late. And this is one of the benefits that Oracle has: being able to avoid moving data unless you need to. But we also have a great graphical interface, which is called Data Studio. As I was mentioning before, even if you leave university today, tomorrow you can start using Autonomous Database and you can become a data engineer.
00:05:52:03 - 00:06:13:18
Unknown
And this is because this suite of tools, which is included for free (something very, very important with Autonomous Database), allows us to go through the life cycle of the data engineer. The life cycle of a data engineer has three main stages: load the data from a data source, then transform the data, which covers data quality, enriching it, and so on.
00:06:14:00 - 00:06:36:16
Unknown
And then serve it: maybe we need to serve it to a BI person who has to prepare a report for the business, maybe you want to serve it to an internal application because HR, marketing, and so on need it, or even to applications for third parties. So this is really very important, and all three stages are very important for a data engineer.
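The three stages described here (load, transform, serve) can be sketched in a few lines. This is only an illustration under stated assumptions, not how Data Studio works internally: Python's built-in sqlite3 module stands in for the database, and the table names, column names, and sample rows are all invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: a small CSV with messy values (invented sample data).
RAW_CSV = """customer,country,amount
Ana,es,100.50
Ben,US,
ana,ES,20.00
"""

def load(conn, raw_text):
    """Stage 1: load the source rows as-is into a staging table."""
    conn.execute("CREATE TABLE staging (customer TEXT, country TEXT, amount TEXT)")
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    conn.executemany(
        "INSERT INTO staging VALUES (:customer, :country, :amount)", rows
    )

def transform(conn):
    """Stage 2: apply simple data-quality rules (drop rows with no amount,
    normalize casing) and write a clean table."""
    conn.execute(
        """
        CREATE TABLE sales AS
        SELECT lower(customer) AS customer,
               upper(country) AS country,
               CAST(amount AS REAL) AS amount
        FROM staging
        WHERE amount <> ''
        """
    )

def serve(conn):
    """Stage 3: expose an aggregate that a report or application could consume."""
    cur = conn.execute(
        "SELECT customer, country, SUM(amount) FROM sales "
        "GROUP BY customer, country ORDER BY customer"
    )
    return cur.fetchall()

def run_pipeline(raw_text=RAW_CSV):
    conn = sqlite3.connect(":memory:")
    load(conn, raw_text)
    transform(conn)
    return serve(conn)

if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage as its own function mirrors the life cycle in the conversation: the staging table keeps the raw rows untouched, so the quality rules in the transform step can change without reloading the source.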
00:06:36:16 - 00:07:06:04
Unknown
And you can achieve it very easily with a graphical interface, which is called Data Studio. And now, what about artificial intelligence? AI is transforming every industry. But what exactly is an AI data engineer? So we have been talking about the data engineer. The AI data engineer goes a bit further. Something that I see and hear a lot is that AI is not going to take your job, but someone who uses it will.
00:07:06:04 - 00:07:28:18
Unknown
So this is very, very important, because it is going to boost productivity. But even if it boosts productivity, it is very important to check what the AI is suggesting, because it's not going to do the whole job for us. Also, in Data Studio, we have included a lot of AI features that anybody can use.
00:07:28:20 - 00:07:49:00
Unknown
But something that I would like to highlight is that all the AI features that are included are open. That means I can work with any LLM in the market. You know that today OpenAI is really cool, but tomorrow we have Llama 3, and then we have DeepSeek, and everything is changing.
00:07:49:02 - 00:08:10:01
Unknown
So something which is very important, from a business point of view, is to say: okay, I want to take this technology and use it as soon as possible, but I don't want to change all the processes that I have underneath. If you have to adapt or develop something every day, you will never get anything to production, or anything useful.
00:08:10:03 - 00:08:26:17
Unknown
And that's why, in the database, we are able to connect to any LLM, and we are able to make suggestions or work to improve these kinds of pipelines, data loads, and all the kinds of features that we like to highlight. So this is very important.
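The point about staying open to any LLM is, at heart, an interface-design idea: application code depends on a small contract, and the model behind it is chosen by configuration. The sketch below illustrates that idea only. It is not how Autonomous Database implements it, both providers are stubs that make no network calls, and every name in it is hypothetical.

```python
from typing import Callable, Dict

# A "provider" is any function that takes a prompt and returns text.
# Both providers below are stubs invented for illustration; a real one
# would call a vendor's API instead.
Provider = Callable[[str], str]

def stub_provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt}"

def stub_provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt}"

PROVIDERS: Dict[str, Provider] = {
    "provider-a": stub_provider_a,
    "provider-b": stub_provider_b,
}

class Assistant:
    """Application code depends only on this class, never on a vendor."""

    def __init__(self, provider_name: str):
        self.use(provider_name)

    def use(self, provider_name: str) -> None:
        # Switching provider is a configuration change, not a rewrite.
        if provider_name not in PROVIDERS:
            raise ValueError(f"unknown provider: {provider_name}")
        self._provider = PROVIDERS[provider_name]

    def ask(self, question: str) -> str:
        return self._provider(question)

if __name__ == "__main__":
    assistant = Assistant("provider-a")
    print(assistant.ask("Summarize yesterday's load errors"))
    assistant.use("provider-b")  # same application code, different model
    print(assistant.ask("Summarize yesterday's load errors"))
```

Because callers only ever touch `Assistant.ask`, the pipelines built on top keep working when the provider changes, which is the business concern raised in the conversation.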
00:08:26:19 - 00:08:50:17
Unknown
I mean, that is how we boost productivity. But the AI data engineer goes a bit further, because on top of the load, transform, and serve stages that I was mentioning before, the AI can do more things. One thing is to create synthetic data, or fake data. Imagine that I have to create an application for an internal marketing department.
00:08:50:19 - 00:09:14:23
Unknown
So I can use AI to generate the data model and to generate the fake data, and the developer can start working directly tomorrow. This boosts productivity, and I don't have to worry about all these things; it's going to be quick. Also, something very popular that we can easily use in Autonomous Database: we can work with it as a vector database, because vector is a data type inside Autonomous Database.
00:09:15:01 - 00:09:40:22
Unknown
For anybody who is not familiar with vectors, they are the information we provide to the LLM so that it can give more business-related answers. For example, I have PDFs, I have Excel files, I have a lot of information and pictures that I want my LLM to use to help me in my business, and I can do it directly inside Autonomous Database in a very easy way.
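To make the vector idea concrete: each document is turned into a list of numbers (an embedding), and the documents whose vectors are closest to the question's vector are handed to the LLM as context. The sketch below is a toy version under stated assumptions: the 3-dimensional vectors are written by hand instead of coming from an embedding model, plain Python stands in for the database's vector data type, and the file names are invented.

```python
import math

# Invented 3-dimensional "embeddings"; real ones come from an embedding
# model and usually have hundreds or thousands of dimensions.
DOCUMENTS = {
    "vacation_policy.pdf": [0.9, 0.1, 0.0],
    "expense_report.xlsx": [0.1, 0.9, 0.1],
    "office_floor_plan.png": [0.0, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_matches(query_vector, k=1):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda name: cosine_similarity(query_vector, DOCUMENTS[name]),
        reverse=True,
    )
    return ranked[:k]

if __name__ == "__main__":
    # Pretend this vector is the embedding of "How many vacation days do I have?"
    question_vector = [0.8, 0.2, 0.1]
    print(top_matches(question_vector, k=2))
```

The names returned by `top_matches` are what would be passed to the LLM together with the question, so the answer is grounded in the business's own documents.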
00:09:41:00 - 00:10:02:02
Unknown
So that's why, from an AI data engineer point of view, I can generate fake or synthetic data. I can go with vectors, which is very important. I don't need to move the data in or out to a different system, which is the converged capability that I mentioned before. And I can also create production-ready applications in a matter of minutes.
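The synthetic data idea can be illustrated without any AI at all. In the sketch below, a seeded random generator plays the role that the conversation assigns to the AI: it produces plausible rows for a data model so that a developer can start building before real data is available. The columns, names, and value ranges are assumptions made up for this example.

```python
import random

# Invented value pools for a hypothetical marketing "customers" table.
FIRST_NAMES = ["Ana", "Ben", "Chloe", "Dev"]
SEGMENTS = ["retail", "enterprise", "public"]

def generate_customers(count, seed=42):
    """Return `count` fake customer rows; the same seed gives the same rows."""
    rng = random.Random(seed)
    rows = []
    for customer_id in range(1, count + 1):
        rows.append(
            {
                "id": customer_id,
                "name": rng.choice(FIRST_NAMES),
                "segment": rng.choice(SEGMENTS),
                "monthly_spend": round(rng.uniform(10, 500), 2),
            }
        )
    return rows

if __name__ == "__main__":
    for row in generate_customers(3):
        print(row)
```

Fixing the seed keeps the fake data reproducible, which helps when a developer and a tester need to look at the same rows.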
00:10:02:04 - 00:10:24:19
Unknown
We also have a great capability, which is Select AI, in which we can use natural language to resolve business queries. And compared with the competition, we have something which is really nice, or very cool, and this is because the Select AI feature is inside the Oracle Database; it is in the heart of the Oracle Database.
00:10:24:21 - 00:10:44:11
Unknown
So for example, if you go and talk with ChatGPT, it is outside. It doesn't know anything about your data when you tell it, "Give me a recommendation about what I'm looking for." But once you are inside the Oracle Database, with Select AI the LLM is able to know everything about your business.
00:10:44:13 - 00:11:06:15
Unknown
But on top of that, there are two key things that are very important. One is the integration with the whole Oracle ecosystem. It's very easy to create an application. For example, I created a very simple application, just to show the potential, although it is a bit useless: I use my Apple Watch and Siri to talk with my Autonomous Database.
00:11:06:21 - 00:11:27:23
Unknown
So even though it seems like I'm talking with Siri, in the end the Apple Watch is sending a REST API call with a Select AI SQL statement. Creating this kind of thing is a one-hour effort or less. So this is very, very powerful. And I think it's more complicated to configure the Apple Watch
00:11:27:23 - 00:11:47:03
Unknown
than to build the REST API. So this gives our customers, or anybody who is using it, the feeling that they can do it very quickly. And if tomorrow, instead of OpenAI, I want to use another provider, I can do it very easily. My application is going to work the same.
00:11:47:06 - 00:12:07:07
Unknown
My REST API is going to be the same. And the second thing, which is very important, is security. Because even if you ask, "Tell me all the data," in the end you're going to generate a SQL statement. And if your user doesn't have the privileges to read all the data, the LLM is not going to return it.
00:12:07:08 - 00:12:37:19
Unknown
All the information is secure by default. I have seen customers that, for example, build their own solution: let's say they put everything in a black box like ChatGPT, and it works. But they have a concern: what happens if I try to get information, imagine medical or healthcare information, about another user? You are never 100% sure that nobody can do some hacking to get this information.
00:12:37:21 - 00:13:00:10
Unknown
So in the end, what they do is build another LLM to supervise the response, and then another LLM to supervise the second LLM. So everything becomes very, very complicated. But if you have security by default, you can forget about all these things and make your applications ready to do all this kind of thing, which is really powerful.
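The security argument rests on one mechanism: whatever the LLM produces ends up as a SQL statement, and that statement runs with the privileges of the user who asked. The sketch below imitates that behavior under loud assumptions. It uses the authorizer callback of Python's built-in sqlite3 module as a stand-in for database privileges, which is not how Oracle's privilege model works, and the tables, rows, and "generated" SQL strings are invented for the example.

```python
import sqlite3

def open_session(allowed_tables):
    """Return a connection on which reads are limited to allowed_tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
    conn.execute("CREATE TABLE medical_records (patient TEXT, diagnosis TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 'laptop')")
    conn.execute("INSERT INTO medical_records VALUES ('Ana', 'confidential')")

    def authorizer(action, arg1, arg2, db_name, source):
        # For SQLITE_READ, arg1 is the name of the table being read.
        if action == sqlite3.SQLITE_READ and arg1 not in allowed_tables:
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK

    # Installed after the setup statements, so it only governs later queries.
    conn.set_authorizer(authorizer)
    return conn

def run_generated_sql(conn, sql):
    """Run SQL as if an LLM had generated it; the database decides access."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        raise PermissionError(str(exc)) from exc

if __name__ == "__main__":
    session = open_session(allowed_tables={"orders"})
    print(run_generated_sql(session, "SELECT item FROM orders"))
    try:
        # Stand-in for the SQL produced by the prompt "tell me all the data".
        run_generated_sql(session, "SELECT * FROM medical_records")
    except PermissionError as exc:
        print("blocked by the database:", exc)
```

The check lives in the database layer, not in the prompt, so no second model is needed to supervise the first one's answers.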
00:13:00:12 - 00:13:22:15
Unknown
So that's why I think that the AI data engineer with Autonomous Database has all these benefits to create value for the business in a matter of minutes. So this is super interesting. For those listening who want to get started in this field, where can our audience go to learn about how to become an AI data engineer?
00:13:22:17 - 00:13:42:06
Unknown
I think one of the key resources, because we are talking about many features of Oracle Autonomous Database, such as being able to work with any kind of data type or to create REST services and so on, is a web page called LiveLabs. LiveLabs is Oracle's free training.
00:13:42:06 - 00:14:08:16
Unknown
So we have more than 1,000 trainings there for free, in which you can even have your own sandbox to play around. Also very important: if you want to play on your local laptop, we have a Docker image, so it can work on the laptop if you are interested in doing this kind of thing. And from there, there are many LiveLabs that will be interesting, like the basics of Autonomous Database: how it works and how to create one.
00:14:08:18 - 00:14:30:08
Unknown
We have LiveLabs about Data Studio: how to start with the graphical interface, what it looks like, and so on. Then maybe I would do something about Select AI: how I can connect to an LLM, how I can create my application, and how I can create the REST data services on top of that. So these are the key components that I would do.
00:14:30:10 - 00:14:58:19
Unknown
And then, to become a real AI data engineer, I would take the vector LiveLabs: how to work with vectors directly inside the Oracle Database. And then something that I highly recommend is to use APEX, or Application Express, which is low-code application development, so I can create applications without the need for code. We even have prebuilt applications that you can use, for example a chat application, for the first use case that anybody wants:
00:14:58:21 - 00:15:24:21
Unknown
I want to chat with my data. I don't want to use a black box or write strange commands; I want an application and an easy way. So with this free LiveLab, you can have it on your laptop very easily and start working with Select AI on your business data. We also have another Select AI LiveLab, with RAG support, which is a bit more advanced, because the first business requirement is "I want to chat with my data."
00:15:24:23 - 00:15:46:23
Unknown
And the second customer requirement is: I want to create an application for my business users. For example, I know that nobody reads the documentation, and I include myself there. So let me create an application in which anybody can ask questions about the documentation, to resolve any issues, to reduce the number of tickets, and so on.
00:15:47:01 - 00:16:09:01
Unknown
This is also done in a free LiveLab using vectors. We have the PDFs, the vectors, and then the LLM, and in a few clicks you can have all of this working. So I recommend going to this LiveLab. You can look it up on Google and you will find it, because it's very nice, very simple, and very intuitive, with a lot of videos.
00:16:09:04 - 00:16:34:14
Unknown
So anybody can follow it. And something which is very important is that we don't assume that you have any previous knowledge. Well, this has been such an inspiring conversation. Before we wrap up, my final question is: if you could give one piece of advice to faculty or students, what would it be?
00:16:34:16 - 00:16:54:21
Unknown
Something that I see now is that everything is changing very fast. We see a lot of products, a lot of AI, a lot of things going on in the market. It is very basic, but something that doesn't change is the fundamentals: a good data strategy is very important to consume any kind of AI.
00:16:54:23 - 00:17:17:22
Unknown
Maybe the name has changed, but if you look past the different name, it is the same concept, the same architecture, the same best practices. So my recommendation would be to learn all these principles, because I don't think they are going to change. It is true that, for example, the cloud has more flexibility and will be more cost efficient, but the architecture is not going to change.
00:17:17:22 - 00:17:40:12
Unknown
So this is the first thing. And the second thing: SQL is really powerful. We see it as a standard everywhere: SQL for JSON, SQL for relational, SQL for graph, everywhere. So these two key things would be the main components I would recommend to anybody looking to start building this kind of thing, and then going for these LiveLabs
00:17:40:12 - 00:18:01:05
Unknown
if you want to become an AI data engineer. Of course. I think those were great pieces of advice. So a big thank you to Javier for joining me on the podcast today. Thank you very much. If you want to learn more about Oracle Academy, please check out our website and subscribe to the podcast. Thank you for listening.
00:18:01:07 - 00:18:13:10
Unknown
That wraps up this episode. Thanks for listening and stay tuned for the next Oracle Academy Tech Chat podcast.