A Data Scientist's Guide to Debugging Common Pandas Errors
Explore the most common pandas errors in Python data science workflows.
As you may know, some errors show up exactly when you least expect them. In this article, we'll work through such common pandas errors with simple examples.
Note: ▶️ If you want to see all the code in one place, it is available in the accompanying notebook.
Let's start by importing the required libraries:
import pandas as pd
import numpy as np
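1. DataFrame Merge Key Mismatch
A key mismatch when merging dataframes is one of the most common errors you will run into.
In the example below, the two dataframes name the key column differently ("customer_id" in one, "CustomerID" in the other). If you try to merge them on a single name, you will get a KeyError exception: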
# Create sample dataframes
sales_df = pd.DataFrame({
    'customer_id': [101, 102, 103, 104],
    'sale_amount': [1500, 2300, 1800, 3200]
})
customer_df = pd.DataFrame({
    # Note the different column name and slightly different data
    'CustomerID': [101, 102, 103, 105],
    'customer_name': ['Alice', 'Bob', 'Charlie', 'Eve']
})
try:
    # This will raise an error: customer_df has no 'customer_id' column
    merged_df = sales_df.merge(customer_df, left_on='customer_id', right_on='customer_id')
except KeyError as e:
    print("KeyError:", e)
The error occurs because pandas cannot find the matching column name:
KeyError: 'customer_id'
To fix it, first check the column names with df.columns, then rename the columns as needed for consistency.
Alternatively, explicitly specify the columns to merge on, using the correct column names:
merged_df = sales_df.merge(customer_df, left_on='customer_id', right_on='CustomerID')
This merge succeeds and returns the combined dataframe.
I suggest renaming the "CustomerID" column to "customer_id" for consistency.
Note that only the customers present in both dataframes appear in the merged dataframe, because merge performs an inner join by default.
Use an outer join (how='outer') so that the union of the keys in both dataframes is kept.
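The two suggestions, renaming the key column and using an outer join, can be sketched together as follows. The `indicator=True` argument is an addition that the original example does not use; it adds a `_merge` column showing which side each row came from.

```python
import pandas as pd

sales_df = pd.DataFrame({
    'customer_id': [101, 102, 103, 104],
    'sale_amount': [1500, 2300, 1800, 3200]
})
customer_df = pd.DataFrame({
    'CustomerID': [101, 102, 103, 105],
    'customer_name': ['Alice', 'Bob', 'Charlie', 'Eve']
})

# Rename the key so both dataframes share one column name
customer_df = customer_df.rename(columns={'CustomerID': 'customer_id'})

# Outer join keeps the union of keys; cells with no match become NaN
outer_df = sales_df.merge(customer_df, on='customer_id', how='outer', indicator=True)
print(outer_df)
```

Customers 104 and 105 each exist on one side only, so they show up with missing values and a `left_only` or `right_only` marker.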
2. Mixed Data Types in Operations
Check the data types of your columns and make sure they are of the appropriate type.
Let's take an example. Notice how our "value" column contains strings, including "NA":
# Create sample dataframe with mixed types
mixed_df = pd.DataFrame({
    'value': ['100', '200', 'NA', '400', '500']
})
try:
    # This will raise an error
    result = mixed_df['value'].mean()
except TypeError as e:
    print("TypeError:", e)
This raises a TypeError because we cannot take the mean of strings:
TypeError: Could not convert string '100200NA400500' to numeric
Convert the column to a numeric type with errors='coerce' to handle non-numeric values gracefully:
mixed_df['value'] = pd.to_numeric(mixed_df['value'], errors='coerce')
result = mixed_df['value'].mean()
If you print the result variable, you will see:
300.0
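Because errors='coerce' turns every unparseable entry into NaN without any message, it can help to check which entries were affected. A minimal sketch on the same sample column (the names `converted` and `bad_rows` are illustrative, not from the original):

```python
import pandas as pd

mixed_df = pd.DataFrame({'value': ['100', '200', 'NA', '400', '500']})

converted = pd.to_numeric(mixed_df['value'], errors='coerce')

# Entries that were present originally but became NaN after coercion
bad_rows = mixed_df.loc[converted.isna() & mixed_df['value'].notna(), 'value']
print(bad_rows.tolist())
print(converted.mean())
```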
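3. DataFrame View vs. Copy (SettingWithCopyWarning)
SettingWithCopyWarning occurs when you assign to a subset that may be a view of the original dataframe rather than an independent copy, so it is unclear whether the operation affects the original data or not. Let's take a simple example:
# Create sample dataframe
data = pd.DataFrame({
    'category': ['A', 'A', 'B', 'B', 'C'],
    'value': [1, 2, 3, 4, 5]
})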
# This will trigger a warning
subset_data = data[data['category'] == 'A']
subset_data['value'] = subset_data['value'] * 2
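To learn more about this warning, read "Returning a view versus a copy" and the SettingWithCopyWarning documentation.
To fix it, explicitly create a copy with .copy() when you work with a subset of your data:
subset_data = data[data['category'] == 'A'].copy()
subset_data['value'] = subset_data['value'] * 2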
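If the goal is instead to change the original dataframe, a single `.loc` assignment with a boolean mask avoids the intermediate subset altogether. This is an alternative that the original example does not show, sketched here on the same sample data:

```python
import pandas as pd

data = pd.DataFrame({
    'category': ['A', 'A', 'B', 'B', 'C'],
    'value': [1, 2, 3, 4, 5]
})

# Select rows and column in one .loc call and assign back to the original
mask = data['category'] == 'A'
data.loc[mask, 'value'] = data.loc[mask, 'value'] * 2
print(data)
```

Which form to use depends on intent: `.copy()` when the subset should be independent, `.loc` when the original should change.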
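4. NaN Propagation in Calculations
Missing data may not raise an error at all; pandas simply propagates NaN values silently through your calculations.
In our example, any calculation that involves a NaN results in NaN:
# Create sample dataframe with NaN values
finance_df = pd.DataFrame({
    'revenue': [1000, 2000, np.nan, 4000],
    'costs': [500, np.nan, 1500, 2000]
})
# This will give unexpected results
profit = finance_df['revenue'] - finance_df['costs']
Without handling those np.nan values, you will get: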
0 500.0
1 NaN
2 NaN
3 2000.0
dtype: float64
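To handle them, fill the NaN values with 0, interpolate them, or drop them entirely:
profit = finance_df['revenue'].fillna(0) - finance_df['costs'].fillna(0)
This should give the following profit values: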
0 500.0
1 2000.0
2 -1500.0
3 2000.0
dtype: float64
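Filling with 0 is only one of the options, and which one is appropriate depends on what a missing value means in your data. The sketch below shows the other two, dropping incomplete rows and linear interpolation, on the same sample dataframe (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

finance_df = pd.DataFrame({
    'revenue': [1000, 2000, np.nan, 4000],
    'costs': [500, np.nan, 1500, 2000]
})

# Option 1: drop rows where either input is missing
complete = finance_df.dropna(subset=['revenue', 'costs'])
profit_dropped = complete['revenue'] - complete['costs']

# Option 2: linearly interpolate gaps from neighbouring rows
filled = finance_df.interpolate()
profit_interpolated = filled['revenue'] - filled['costs']

print(profit_dropped)
print(profit_interpolated)
```

Interpolation only makes sense when the rows have a meaningful order, such as consecutive time periods.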
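5. Index Alignment Issues
pandas aligns data on index labels when it performs operations. This can lead to unexpected results or errors when the indices do not match.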
# Create sample dataframes with different indices
df_1 = pd.DataFrame({'value': [1, 2, 3]}, index=['A', 'B', 'C'])
df_2 = pd.DataFrame({'value': [4, 5, 6]}, index=['B', 'C', 'D'])
try:
    # Addition aligns on index labels; labels missing on one side yield NaN
    result = df_1['value'] + df_2['value']
except Exception as e:
    print("Exception:", e)
This may give unexpected results:
A NaN
B 6.0
C 8.0
D NaN
Name: value, dtype: float64
To avoid this, check your index values with df.index before the operation, or use methods such as .add() with the fill_value parameter:
result = df_1['value'].add(df_2['value'], fill_value=0)
This should give the expected output:
A 1.0
B 6.0
C 8.0
D 6.0
Name: value, dtype: float64
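Before operating on two objects, you can also compare their indices directly. A small sketch using `Index.equals`, `Index.intersection` and `Index.symmetric_difference` (the variable names are illustrative):

```python
import pandas as pd

df_1 = pd.DataFrame({'value': [1, 2, 3]}, index=['A', 'B', 'C'])
df_2 = pd.DataFrame({'value': [4, 5, 6]}, index=['B', 'C', 'D'])

# True only if both indices hold the same labels in the same order
same_index = df_1.index.equals(df_2.index)
# Labels present on both sides
shared = df_1.index.intersection(df_2.index)
# Labels present on exactly one side; these would produce NaN
unmatched = df_1.index.symmetric_difference(df_2.index)

print(same_index)
print(list(shared))
print(list(unmatched))
```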
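6. Memory Issues with Large Dataframes
With large dataframes, memory and performance problems are common. Creating multiple copies of a large dataframe is inefficient.
Here is an example: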
def processing_func():
    # Create a large dataframe (this is a small example)
    big_df = pd.DataFrame(np.random.randn(1000000, 10))
    # Inefficient way (creates multiple copies)
    processed_df = big_df
    for col in big_df.columns:
        processed_df = processed_df[processed_df[col] > 0]
    return processed_df
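A better way to do this is to chain operations where possible:
def a_better_processing_func():
    # Create a large dataframe (this is a small example)
    big_df = pd.DataFrame(np.random.randn(1000000, 10))
    # Efficient solution (chain operations)
    mask = (big_df > 0).all(axis=1)
    processed_df = big_df[mask]
    return processed_df
Use the memory_usage() method to monitor the memory usage of each column.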
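Per-column memory can be inspected with `DataFrame.memory_usage()`. The sketch below uses a smaller dataframe than the example above so that it runs quickly; `deep=True` is an optional argument that also counts the contents of object (string) columns.

```python
import numpy as np
import pandas as pd

# Smaller than the example above, purely to keep the sketch fast
df = pd.DataFrame(np.random.randn(100_000, 10))

# Bytes used by the index and by each column
per_column = df.memory_usage(deep=True)
print(per_column)
print(f"Total: {per_column.sum() / 1e6:.1f} MB")
```

Each float64 column holds 8 bytes per row, so the per-column figures give a quick way to spot candidates for a smaller dtype.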
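Wrapping Up
We looked at common pandas errors that data scientists run into in their daily work, and at how to fix each of them.
I hope you found this useful. Happy data analysis!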