A Comprehensive Guide to CSV Files vs. Parquet Files in PySpark
When working with large-scale data processing in PySpark, understanding the differences between data formats like CSV and Parquet is essential for efficient data storage, query performance, and scalability. In this guide, we’ll compare CSV and Parquet files, explore their strengths and weaknesses, and provide examples of how to work with both formats in PySpark.

1. What is a CSV File?

A CSV (Comma-Separated Values) file is a simple text-based format in which each row represents a record and columns are separated by commas (or other delimiters such as tabs or semicolons). CSV files are widely used because of their simplicity and compatibility with many systems.

Characteristics of CSV:

- Human-readable: CSV files are plain text, making them easy to open and read in any text editor.
- No schema: CSV files don’t store metadata (such as data types), which can make it harder to work with complex data structures.
- Slower performance: Because CSV is not a binary format, reading and writing large datasets is slower than with a columnar binary format like Parquet.
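To make this concrete, here is a minimal sketch of reading and writing a CSV file in PySpark. The file path data/people.csv and the output path are placeholders for illustration; the options shown (header, inferSchema) are standard PySpark CSV reader options.

```python
from pyspark.sql import SparkSession

# Start (or reuse) a local SparkSession
spark = SparkSession.builder.appName("csv-example").getOrCreate()

# Read a CSV file; header=True treats the first row as column names,
# and inferSchema=True makes Spark scan the data to guess column types,
# since CSV itself carries no type information.
df = spark.read.csv("data/people.csv", header=True, inferSchema=True)

df.printSchema()   # inspect the inferred schema
df.show(5)         # preview a few rows

# Write the DataFrame back out as CSV
df.write.mode("overwrite").csv("output/people_csv", header=True)
```

Note that inferSchema triggers an extra pass over the data, which adds to the read cost on large files; with Parquet, the schema is stored in the file itself, so no inference is needed.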