Content provided by Oracle University and Oracle Corporation. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Oracle University and Oracle Corporation or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Core AI Concepts – Part 2

12:42
 
In this episode, Lois Houston and Nikita Abraham continue their discussion on AI fundamentals, diving into Data Science with Principal AI/ML Instructor Himanshu Raj. They explore key concepts like data collection, cleaning, and analysis, and talk about how quality data drives impactful insights.

AI for You: https://mylearn.oracle.com/ou/course/ai-for-you/152601/252500
Oracle University Learning Community: https://education.oracle.com/ou-community
LinkedIn: https://www.linkedin.com/showcase/oracle-university/
X: https://x.com/Oracle_Edu

Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode.

Episode Transcript:

00:00

Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

00:25

Lois: Hello and welcome to the Oracle University Podcast. I’m Lois Houston, Director of Innovation Programs with Oracle University, and with me today is Nikita Abraham, Team Lead: Editorial Services.

Nikita: Hi everyone! Last week, we began our exploration of core AI concepts, specifically machine learning and deep learning. I’d really encourage you to go back and listen to the episode if you missed it.

00:52

Lois: Yeah, today we’re continuing that discussion, focusing on data science, with our Principal AI/ML Instructor Himanshu Raj.

Nikita: Hi Himanshu! Thanks for joining us again. So, let’s get cracking! What is data science?

01:06

Himanshu: It's about collecting, organizing, analyzing, and interpreting data to uncover valuable insights that help us make better business decisions. Think of data science as the engine that transforms raw information into strategic action.

You can think of a data scientist as a detective. They gather clues, which are our data, connect the dots between those clues, and ultimately solve mysteries, meaning they find hidden patterns that can drive value.

01:33

Nikita: Ok, and how does this happen exactly?

Himanshu: Just like a detective relies on both instincts and evidence, data science blends domain expertise and analytical techniques. First, we collect raw data. Then we prepare and clean it because messy data leads to messy conclusions. Next, we analyze to find meaningful patterns in that data. And finally, we turn those patterns into actionable insights that businesses can trust.

02:00

Lois: So what you’re saying is, data science is not just about technology; it's about turning information into intelligence that organizations can act on. Can you walk us through the typical steps a data scientist follows in a real-world project?

Himanshu: So it all begins with business understanding. Identifying the real problem we are trying to solve. It's not about collecting data blindly. It's about asking the right business questions first. And once we know the problem, we move to data collection, which is gathering the relevant data from available sources, whether internal or external.

Next one is data cleaning. Probably the least glamorous but one of the most important steps. And this is where we fix missing values, remove errors, and ensure that the data is usable. Then we perform data analysis or what we call exploratory data analysis.

Here we look for patterns, trends, and initial signals hidden inside the data. After that comes modeling and evaluation, where we apply machine learning or deep learning techniques to predict, classify, or forecast outcomes. Machine learning and deep learning are like specialized equipment in a data science detective's toolkit. Powerful but not the whole investigation.

We also check how good the models are in terms of accuracy, relevance, and business usefulness. Finally, if the model meets expectations, we move to deployment and monitoring, putting the model into real world use and continuously watching how it performs over time.

03:34

Nikita: So, it’s a linear process?

Himanshu: It's not linear. That's because in real world data science projects, the process does not stop after deployment. Once the model is live, business needs may evolve, new data may become available, or unexpected patterns may emerge.

And that's why we come back to business understanding again, defining the questions, the strategy, and sometimes even the goals based on what we have learned. In a way, a good data science project behaves like a living system, which grows, adapts, and improves over time. Continuous improvement keeps it aligned with business value.

Now, think of it like adjusting your GPS while driving. The route you plan initially might change as new traffic data comes in. Similarly, in data science, new information constantly helps refine our course. The quality of our data determines the quality of our results.

If the data we feed into our models is messy, inaccurate, or incomplete, the outputs, no matter how sophisticated the technology, will also be unreliable. And this concept is often called garbage in, garbage out. Bad input leads to bad output.

Now, think of it like cooking. Even the world's best Michelin star chef can't create a masterpiece with spoiled or poor-quality ingredients. In the same way, even the most advanced AI models can't perform well if the data they are trained on is flawed.

05:05

Lois: Yeah, that's why high-quality data is not just nice to have, it’s absolutely essential. But Himanshu, what makes data good?

Himanshu: Good data has a few essential qualities. First, it should be complete. We make sure we aren't missing any critical fields. For example, every customer record must have a phone number and an email. Second, it should be accurate. The data should reflect reality. If a customer's address has changed, it must be updated, not outdated. Third, it should be consistent. Similar data must follow the same format. Imagine if the dates are written differently, like 2024/04/28 versus April 28, 2024. We must standardize them.

Fourth one. Good data should be relevant. We collect only the data that actually helps solve our business question, not unnecessary noise. And last one, it should be timely. So data should be up to date. Using last year's purchase data for a real time recommendation engine wouldn't be helpful.
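
Some of these qualities can be checked mechanically. The following is a minimal Python sketch, assuming hypothetical customer records; the field names (`phone`, `email`, `signup_date`, `last_updated`), the ISO date format, and the one-year freshness window are illustrative assumptions, not details from the episode. Accuracy and relevance are left out because they usually need a trusted reference or a business judgment rather than a simple rule.

```python
from datetime import date, datetime

# Hypothetical customer records; field names are illustrative assumptions.
records = [
    {"phone": "555-0100", "email": "a@example.com",
     "signup_date": "2024-04-28", "last_updated": date(2025, 3, 1)},
    {"phone": "", "email": "b@example.com",
     "signup_date": "April 28, 2024", "last_updated": date(2023, 1, 15)},
]

REQUIRED_FIELDS = ("phone", "email")


def is_complete(record):
    """Completeness: every required field is present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def is_consistent(record):
    """Consistency: the date follows one agreed format (ISO 8601 here)."""
    try:
        datetime.strptime(record["signup_date"], "%Y-%m-%d")
        return True
    except ValueError:
        return False


def is_timely(record, today, max_age_days=365):
    """Timeliness: the record was updated within the allowed window."""
    return (today - record["last_updated"]).days <= max_age_days


def quality_report(record, today):
    """Collect the three mechanical checks for one record."""
    return {
        "complete": is_complete(record),
        "consistent": is_consistent(record),
        "timely": is_timely(record, today),
    }


today = date(2025, 6, 1)  # fixed date so the sketch is reproducible
reports = [quality_report(r, today) for r in records]
```

In this sketch the first record passes all three checks, while the second fails each one: its phone number is empty, its date is in a different format, and it was last updated more than a year before the reference date.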

06:13

Nikita: Ok, so ideally, we should use good data. But that’s a bit difficult in reality, right? Because what comes to us is often pretty messy. So, how do we convert bad data into good data? I’m sure there are processes we use to do this.

Himanshu: First one is cleaning. So this is about correcting simple mistakes, like fixing typos in city names or standardizing dates.

The second one is imputation. So if some values are missing, we fill them intelligently, for instance, using the average income for a missing salary field. Third one is filtering. In this, we remove irrelevant or noisy records, like discarding fake email signups from marketing data. The fourth one is enriching. We can even enhance our data by adding trusted external sources, like appending credit scores from a verified bureau.

And the last one is transformation. Here, we finally reshape data formats to be consistent, for example, converting all units to the same currency. So even messy data can become usable, but it takes deliberate effort, structured process, and attention to quality at every step.
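
The five steps described above can be sketched in plain Python. Everything in this example is a hypothetical illustration: the records, the typo table, the list of date formats, the exchange rate, and the credit-score lookup are all assumptions made for the sketch. The steps also run in a slightly different order than they were listed, so that salaries are converted to one currency before the average used for imputation is computed.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw records; all names, values, and rates are assumptions.
raw = [
    {"id": 1, "city": "Bangalroe", "joined": "April 28, 2024",
     "email": "a@example.com", "salary": 50000.0, "currency": "USD"},
    {"id": 2, "city": "Austin", "joined": "2024/05/02",
     "email": "b@example.com", "salary": None, "currency": "USD"},
    {"id": 3, "city": "Austin", "joined": "2024-05-03",
     "email": "fake@test.invalid", "salary": 70000.0, "currency": "USD"},
    {"id": 4, "city": "London", "joined": "2024-06-10",
     "email": "c@example.com", "salary": 40000.0, "currency": "GBP"},
]

CITY_FIXES = {"Bangalroe": "Bangalore"}            # known typos
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%B %d, %Y")
RATES_TO_USD = {"USD": 1.0, "GBP": 1.25}           # assumed exchange rates
CREDIT_SCORES = {1: 710, 2: 655, 4: 780}           # stand-in for a bureau


def standardize_date(text):
    """Cleaning: parse any known format and emit ISO 8601."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def clean(record):
    """Cleaning: fix known city typos and standardize the date."""
    fixed = dict(record)
    fixed["city"] = CITY_FIXES.get(record["city"], record["city"])
    fixed["joined"] = standardize_date(record["joined"])
    return fixed


def is_real_signup(record):
    """Filtering: drop records with an obviously fake email domain."""
    return not record["email"].endswith(".invalid")


def to_usd(record):
    """Transformation: express every salary in one currency."""
    converted = dict(record)
    if record["salary"] is not None:
        converted["salary"] = record["salary"] * RATES_TO_USD[record["currency"]]
    converted["currency"] = "USD"
    return converted


def impute_salary(records):
    """Imputation: fill missing salaries with the mean of the known ones."""
    fill = mean(r["salary"] for r in records if r["salary"] is not None)
    return [dict(r, salary=fill if r["salary"] is None else r["salary"])
            for r in records]


def enrich(record):
    """Enriching: append a credit score from the assumed external lookup."""
    return dict(record, credit_score=CREDIT_SCORES.get(record["id"]))


cleaned = [clean(r) for r in raw]
filtered = [r for r in cleaned if is_real_signup(r)]
converted = [to_usd(r) for r in filtered]
imputed = impute_salary(converted)
prepared = [enrich(r) for r in imputed]
```

Filtering before imputation keeps the fake signup from influencing the average, which is a design choice of this sketch rather than a rule stated in the episode.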

07:26

Oracle University’s Race to Certification 2025 is your ticket to free training and certification in today’s hottest technology. Whether you’re starting with Artificial Intelligence, Oracle Cloud Infrastructure, Multicloud, or Oracle Data Platform, this challenge covers it all! Learn more about your chance to win prizes and see your name on the Leaderboard by visiting education.oracle.com/race-to-certification-2025. That’s education.oracle.com/race-to-certification-2025.

08:10

Nikita: Welcome back! Himanshu, we spoke about how to clean data. Now, once we get high-quality data, how do we analyze it?

Himanshu: In data science, there are four primary types of analysis we typically apply depending on the business goal we are trying to achieve.

The first one is descriptive analysis. It helps summarize and report what has happened, often using averages, totals, or percentages. For example, retailers use descriptive analysis to understand things like what was the average customer spend last quarter? How did store foot traffic trend across months?

The second one is diagnostic analysis. Diagnostic analysis digs deeper into why something happened. For example, hospitals use this type of analysis to find out why a certain department has higher patient readmission rates. Was it due to staffing, post-treatment care, or patient demographics?

The third one is predictive analysis. Predictive analysis looks forward, trying to forecast future outcomes based on historical patterns. For example, energy companies predict future electricity demand, so they can better manage resources and avoid shortages. And the last one is prescriptive analysis. So it does not just predict. It recommends specific actions to take.

So logistics and supply chain companies use prescriptive analytics to suggest the most efficient delivery routes or warehouse stocking strategies based on traffic patterns, order volume, and delivery deadlines.

09:42

Lois: So really, we’re using data science to solve everyday problems. Can you walk us through some practical examples of how it’s being applied?

Himanshu: The first one is predictive maintenance. It is done in manufacturing a lot. A factory collects real time sensor data from machines. Data scientists first clean and organize this massive data stream, explore patterns of past failures, and design predictive models.

The goal is not just to predict breakdowns but to optimize maintenance schedules, reducing downtime and saving millions. The second one is a recommendation system. It's prevalent in retail and entertainment industries. Companies like Netflix or Amazon gather massive user interaction data such as views, purchases, likes.

Data scientists structure and analyze this behavioral data to find meaningful patterns of preferences and build models that suggest relevant content, eventually driving more engagement and loyalty. The third one is fraud detection. It's applied in the finance and banking sector.

Banks store vast amounts of transaction records. Data scientists clean and prepare this data, understand typical spending behaviors, and then use statistical techniques and machine learning to spot unusual patterns, catching fraud faster than manual checks could ever achieve.
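
One of the simplest statistical techniques for spotting an unusual pattern is a z-score rule: flag a transaction that sits far from a customer's typical spending. The sketch below is a hypothetical illustration, not the method any particular bank uses; the transaction amounts and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev

# Hypothetical past transaction amounts for one customer.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5, 44.0, 39.75]


def is_unusual(amount, past_amounts, threshold=3.0):
    """Flag an amount whose z-score against past spending exceeds the threshold."""
    center = mean(past_amounts)
    spread = stdev(past_amounts)
    if spread == 0:
        # All past amounts are identical, so any different amount stands out.
        return amount != center
    return abs(amount - center) / spread > threshold


# Check one ordinary and one extreme amount against the history.
flags = {amount: is_unusual(amount, history) for amount in (46.0, 900.0)}
```

Here the 46.00 purchase is close to typical spending and is not flagged, while the 900.00 purchase is. Real systems usually combine many such signals with machine learning models.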

The last one is customer segmentation, which is often applied in marketing. Businesses collect demographic and behavioral data about their customers. Instead of treating all customers the same, data scientists use clustering techniques to find natural groupings, and this insight helps businesses tailor their marketing efforts, offers, and communication for each of those individual groups, making them far more effective.
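
k-means is one widely used clustering technique, though the episode does not name a specific algorithm. The following is a minimal pure-Python sketch, assuming hypothetical customers described by annual spend and visits per month, two clusters, and hand-picked starting centroids so that the result is reproducible.

```python
from statistics import mean

# Hypothetical customers: (annual_spend, visits_per_month).
customers = [(200, 1), (220, 2), (250, 1), (900, 8), (950, 9), (1000, 10)]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                tuple(mean(dim) for dim in zip(*members)) if members else old
            )
        if updated == centroids:
            break  # assignments have stabilized
        centroids = updated
    return labels, centroids


labels, centroids = k_means(
    customers, initial_centroids=[customers[0], customers[-1]]
)
```

The sketch separates the low-spend, low-visit customers from the high-spend, frequent visitors. In practice the features would usually be scaled first, since here the spend values dominate the distance calculation.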

Across all these examples, notice that data science isn't just building a model. Again, it's understanding the business need, reviewing the data, analyzing it thoughtfully, and building the right solution while helping the business act smarter.

11:44

Lois: Thank you, Himanshu, for joining us on this episode of the Oracle University Podcast. We can’t wait to have you back next week for part 3 of this conversation on core AI concepts, where we’ll talk about generative AI and gen AI agents.

Nikita: And if you want to learn more about data science, visit mylearn.oracle.com and search for the AI for You course. Until next time, this is Nikita Abraham…

Lois: And Lois Houston signing off!

12:13

That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
