Exercise.1 - 77: Computing Accuracy
Practice computing accuracy using a for loop and an if statement.
predictions = [0, 1, 0, 2, 1, 2, 0]
labels = [1, 1, 0, 0, 1, 2, 1]
n_correct = 0
for pred_idx in range(len(predictions)):
    if predictions[pred_idx] == labels[pred_idx]:
        n_correct += 1
accuracy = n_correct / len(predictions)
print(f'accuracy[%] = {accuracy*100:.2f}%')
accuracy[%] = 57.14%
Exercise.1 - 78: Computing a Confusion Vector (confusion matrix)
Using what we covered up to Exercise.1 - 77, we learn how to compute a confusion vector.
Reference on model performance evaluation metrics:
https://wikidocs.net/194087
A confusion matrix is made up of the following four entries:
True Positive (TP): the actual value is Positive and the model predicts Positive
False Positive (FP): the actual value is Negative and the model predicts Positive
False Negative (FN): the actual value is Positive and the model predicts Negative
True Negative (TN): the actual value is Negative and the model predicts Negative
Confusion Matrix - Accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN)
The proportion of predictions that match the actual labels.
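As a side illustration of the formula above, here is a minimal sketch that counts TP, FP, FN, and TN for a small binary example (the predictions and labels below are made up for this sketch, not the exercise data) and derives accuracy from them:

bin_preds = [1, 0, 1, 1, 0, 1]    # hypothetical binary predictions
bin_labels = [1, 0, 0, 1, 1, 1]   # hypothetical binary labels
tp = fp = fn = tn = 0
for pred, label in zip(bin_preds, bin_labels):
    if pred == 1 and label == 1:
        tp += 1   # predicted Positive, actually Positive
    elif pred == 1 and label == 0:
        fp += 1   # predicted Positive, actually Negative
    elif pred == 0 and label == 1:
        fn += 1   # predicted Negative, actually Positive
    else:
        tn += 1   # predicted Negative, actually Negative
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, fp, fn, tn, accuracy)   # 3 1 1 1 0.666...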
predictions = [0, 1, 0, 2, 1, 2, 0]
labels = [1, 1, 0, 0, 1, 2, 1]
n_classes = -1
for label in labels:
    if label > n_classes:   # track the largest label seen so far
        n_classes = label
n_classes += 1              # number of classes = largest label + 1
class_cnts, correct_cnt, confusion_vec = [], [], []
for _ in range(n_classes):
    class_cnts.append(0)
    correct_cnt.append(0)
    confusion_vec.append(None)
for pred_idx in range(len(predictions)):
    pred = predictions[pred_idx]
    label = labels[pred_idx]
    print(f'pred, label: {pred}, {label}')
    class_cnts[label] += 1
    if pred == label:
        print('--- correct! ---')
        correct_cnt[label] += 1
    print(f'class_cnts, correct_cnt: {class_cnts}, {correct_cnt}')
for class_idx in range(n_classes):
    confusion_vec[class_idx] = correct_cnt[class_idx] / class_cnts[class_idx]
print(f'confusion vector: {confusion_vec}')
pred, label: 0, 1
class_cnts, correct_cnt: [0, 1, 0], [0, 0, 0]
pred, label: 1, 1
--- correct! ---
class_cnts, correct_cnt: [0, 2, 0], [0, 1, 0]
pred, label: 0, 0
--- correct! ---
class_cnts, correct_cnt: [1, 2, 0], [1, 1, 0]
pred, label: 2, 0
class_cnts, correct_cnt: [2, 2, 0], [1, 1, 0]
pred, label: 1, 1
--- correct! ---
class_cnts, correct_cnt: [2, 3, 0], [1, 2, 0]
pred, label: 2, 2
--- correct! ---
class_cnts, correct_cnt: [2, 3, 1], [1, 2, 1]
pred, label: 0, 1
class_cnts, correct_cnt: [2, 4, 1], [1, 2, 1]
confusion vector: [0.5, 0.5, 1.0]
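For reference, the same per-class accuracies (the confusion vector above is essentially per-class recall) can be collected a bit more compactly with zip(); a minimal sketch, reusing the same data:

predictions = [0, 1, 0, 2, 1, 2, 0]
labels = [1, 1, 0, 0, 1, 2, 1]
n_classes = max(labels) + 1

class_cnts = [0] * n_classes
correct_cnt = [0] * n_classes
for pred, label in zip(predictions, labels):
    class_cnts[label] += 1
    correct_cnt[label] += (pred == label)   # True adds 1, False adds 0
confusion_vec = [c / n for c, n in zip(correct_cnt, class_cnts)]
print(confusion_vec)   # [0.5, 0.5, 1.0]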
Exercise.1 - 79: Computing a Histogram
We learn how to compute a histogram using a for loop and an if statement.
scores = [50, 20, 30, 40, 10, 50, 70, 80, 90, 20, 30,]
cutoffs = [0, 2, 40, 60, 80,]
histogram = [0, 0, 0, 0, 0]
for score in scores:
    if score > cutoffs[4]:
        histogram[4] += 1
    elif score > cutoffs[3]:
        histogram[3] += 1
    elif score > cutoffs[2]:
        histogram[2] += 1
    elif score > cutoffs[1]:
        histogram[1] += 1
    else:
        pass
print(f'histogram of the scores: {histogram}')
histogram of the scores: [0, 6, 2, 2, 1]
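The hard-coded if/elif chain above can also be written as a loop over the cutoffs, which scales to any number of bins; a minimal sketch with the same data and the same behavior (scores that exceed none of the upper cutoffs are simply skipped):

scores = [50, 20, 30, 40, 10, 50, 70, 80, 90, 20, 30]
cutoffs = [0, 2, 40, 60, 80]
histogram = [0] * len(cutoffs)
for score in scores:
    # walk the cutoffs from the top; the first one the score exceeds is its bin
    for bin_idx in range(len(cutoffs) - 1, 0, -1):
        if score > cutoffs[bin_idx]:
            histogram[bin_idx] += 1
            break
print(f'histogram of the scores: {histogram}')   # [0, 6, 2, 2, 1]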
Exercise.1 - 80: Computing Absolute Values
Practice building a list of the absolute values of the elements, using a for loop and an if statement.
numbers = [-2, 2, -1, 3, -4, 9]
abs_numbers = []
for num in numbers:
    if num < 0:
        abs_numbers.append(-num)  # same as abs(num)
    else:
        abs_numbers.append(num)
print(f'absolute numbers: {abs_numbers}')
absolute numbers: [2, 2, 1, 3, 4, 9]
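For comparison, the same result with the built-in abs() and a list comprehension (what the if/else above reimplements by hand):

numbers = [-2, 2, -1, 3, -4, 9]
abs_numbers = [abs(num) for num in numbers]   # built-in absolute value
print(f'absolute numbers: {abs_numbers}')     # [2, 2, 1, 3, 4, 9]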
Exercise.1 - 81: Manhattan Distance
Using the method for building a list of absolute values, we learn how to compute the Manhattan distance.
v1 = [1, 3, 5, 2, 1, 5, 2]
v2 = [2, 3, 1, 5, 2, 1, 3]
m_distance = 0
for dim_idx in range(len(v1)):
    sub = v1[dim_idx] - v2[dim_idx]
    if sub < 0:
        m_distance += -sub  # same as abs(sub)
    else:
        m_distance += sub
print(f'Manhattan distance = {m_distance}')
Manhattan distance = 14
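The Manhattan distance is the sum of the element-wise absolute differences, sum(|v1[i] - v2[i]|). A compact version of the loop above, using zip() and abs():

v1 = [1, 3, 5, 2, 1, 5, 2]
v2 = [2, 3, 1, 5, 2, 1, 3]
m_distance = sum(abs(a - b) for a, b in zip(v1, v2))   # sum of |v1[i] - v2[i]|
print(f'Manhattan distance = {m_distance}')            # 14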
Exercise.1 - 82: Building a Nested List & Accessing Its Elements
Practice representing 2-D data as a list and see what each element access returns.
scores = [[10, 20, 30,], [50, 60, 70]]
print(scores)
print(scores[0])
print(scores[1])
print(scores[0][0], scores[0][1], scores[0][2])
print(scores[1][0], scores[1][1], scores[1][2])
[[10, 20, 30], [50, 60, 70]]
[10, 20, 30]
[50, 60, 70]
10 20 30
50 60 70
Exercise.1 - 83: Accessing Nested List Elements (2)
Practice accessing the elements of a 2-D list with for loops.
scores = [[10, 20, 30], [50, 60, 70]]
for student_scores in scores:
    print(student_scores)
    for score in student_scores:
        print(score)
[10, 20, 30]
10
20
30
[50, 60, 70]
50
60
70
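If the row and column indices are needed together with the values, enumerate() provides both; a small sketch with the same nested list:

scores = [[10, 20, 30], [50, 60, 70]]
for row_idx, student_scores in enumerate(scores):
    for col_idx, score in enumerate(student_scores):
        print(f'scores[{row_idx}][{col_idx}] = {score}')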
Exercise.1 - 84: Computing Each Student's Mean Score
Practice computing a row-wise mean with for loops.
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
n_class = len(scores[0])
student_score_mean = []
for student_scores in scores:
    student_score_sum = 0
    for score in student_scores:
        student_score_sum += score
    student_score_mean.append(student_score_sum / n_class)
print(student_score_mean)
[15.0, 25.0, 35.0, 45.0]
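For reference, the same row-wise mean with sum() and len() in a list comprehension:

scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
student_score_mean = [sum(row) / len(row) for row in scores]
print(student_score_mean)   # [15.0, 25.0, 35.0, 45.0]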
Exercise.1 - 85: Computing Each Subject's Mean Score
Practice computing a column-wise mean with for loops.
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
n_student = len(scores)
n_class = len(scores[0])
class_score_mean = []
for idx_class in range(n_class):
    class_score_sum = 0
    for idx_student in range(n_student):
        class_score_sum += scores[idx_student][idx_class]
    print(f'class score sum: {class_score_sum}')
    class_score_mean.append(class_score_sum / n_student)
print(f'class score mean: {class_score_mean}')
class score sum: 100
class score sum: 120
class score sum: 140
class score mean: [25.0, 30.0, 35.0]
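For reference, zip(*scores) transposes the nested list, so each tuple holds one subject's scores and the column-wise mean becomes a one-liner:

scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
class_score_mean = [sum(col) / len(col) for col in zip(*scores)]   # zip(*scores) yields the columns
print(class_score_mean)   # [25.0, 30.0, 35.0]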
Exercise.1 - 86: Mean Subtraction
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
n_student = len(scores)
n_class = len(scores[0])
class_score_mean = []
for idx_class in range(n_class):
    class_score_sum = 0
    for idx_student in range(n_student):
        class_score_sum += scores[idx_student][idx_class]
    class_score_mean.append(class_score_sum / n_student)
    print(f'{idx_class+1} class score sum: {class_score_sum}')
    print(f'{idx_class+1} class score mean: {class_score_mean[idx_class]}')
print(f'class score mean: {class_score_mean}')
print('--- Mean Subtraction ---')
for idx_class in range(n_class):
    for idx_student in range(n_student):
        scores[idx_student][idx_class] -= class_score_mean[idx_class]
print(f'mean subtraction scores: {scores}')
for idx_class in range(n_class):
    class_score_sum = 0
    for idx_student in range(n_student):
        class_score_sum += scores[idx_student][idx_class]
    class_score_mean[idx_class] = class_score_sum / n_student
    print(f'{idx_class+1} class score sum: {class_score_sum}')
    print(f'{idx_class+1} class score mean: {class_score_mean[idx_class]}')
print(f'class score mean: {class_score_mean}')
1 class score sum: 100
1 class score mean: 25.0
2 class score sum: 120
2 class score mean: 30.0
3 class score sum: 140
3 class score mean: 35.0
class score mean: [25.0, 30.0, 35.0]
--- Mean Subtraction ---
mean subtraction scores: [[-15.0, -15.0, -15.0], [-5.0, -5.0, -5.0], [5.0, 5.0, 5.0], [15.0, 15.0, 15.0]]
1 class score sum: 0.0
1 class score mean: 0.0
2 class score sum: 0.0
2 class score mean: 0.0
3 class score sum: 0.0
3 class score mean: 0.0
class score mean: [0.0, 0.0, 0.0]
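The same mean subtraction can be written without index bookkeeping by pairing each row with the column means; a minimal sketch:

scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
col_means = [sum(col) / len(col) for col in zip(*scores)]              # [25.0, 30.0, 35.0]
centered = [[s - m for s, m in zip(row, col_means)] for row in scores]
print(centered)   # every column of the result now has mean 0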
Exercise.1 - 87: Variance & Standard Deviation
Compute each subject's variance and standard deviation using the shortcut Var(X) = E[X^2] - (E[X])^2 (mean of squares minus square of mean).
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
n_student = len(scores)
n_class = len(scores[0])
print('--- mean of square ---')  # mean of squares, E[X^2]
mean_of_square = []
for idx_class in range(n_class):
    sum_of_square = 0
    for idx_student in range(n_student):
        sum_of_square += scores[idx_student][idx_class] ** 2
    mean_of_square.append(sum_of_square / n_student)
    print(f'{idx_class+1} class mean of square: {mean_of_square[idx_class]}')
print('--- square of mean ---')  # square of the mean, (E[X])^2
square_of_mean = []
for idx_class in range(n_class):
    sum_of_class = 0
    for idx_student in range(n_student):
        sum_of_class += scores[idx_student][idx_class]
    square_of_mean.append((sum_of_class / n_student) ** 2)
    print(f'{idx_class+1} class square of mean: {square_of_mean[idx_class]}')
print('--- variance & std ---')  # variance = E[X^2] - (E[X])^2
variance_of_class = []
std_of_class = []
for idx_class in range(n_class):
    variance_of_class.append(mean_of_square[idx_class] - square_of_mean[idx_class])
    std_of_class.append(variance_of_class[idx_class] ** 0.5)
    print(f'{idx_class+1} class variance: {variance_of_class[idx_class]}')
    print(f'{idx_class+1} class standard deviation: {std_of_class[idx_class]}')
--- mean of square ---
1 class mean of square: 750.0
2 class mean of square: 1025.0
3 class mean of square: 1350.0
--- square of mean ---
1 class square of mean: 625.0
2 class square of mean: 900.0
3 class square of mean: 1225.0
--- variance & std ---
1 class variance: 125.0
1 class standard deviation: 11.180339887498949
2 class variance: 125.0
2 class standard deviation: 11.180339887498949
3 class variance: 125.0
3 class standard deviation: 11.180339887498949
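As a sanity check, the shortcut used above, Var(X) = E[X^2] - (E[X])^2, agrees with the population variance and standard deviation from the standard-library statistics module:

import statistics

scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
for idx_class, col in enumerate(zip(*scores)):
    var = statistics.pvariance(col)   # population variance (divides by n, as above)
    std = statistics.pstdev(col)      # population standard deviation
    print(f'{idx_class+1} class variance: {var}, std: {std}')
# each column: variance 125, standard deviation about 11.18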
Exercise.1 - 88: Standardization
Standardization: subtract each subject's mean from its scores and divide by its standard deviation, so every subject ends up with mean 0 and standard deviation 1.
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
n_student = len(scores)
n_class = len(scores[0])
print('--- mean of square ---')  # mean of squares, E[X^2]
mean_of_square = []
for idx_class in range(n_class):
    sum_of_square = 0
    for idx_student in range(n_student):
        sum_of_square += scores[idx_student][idx_class] ** 2
    mean_of_square.append(sum_of_square / n_student)
    print(f'{idx_class+1} class mean of square: {mean_of_square[idx_class]}')
print('--- square of mean ---')  # square of the mean, (E[X])^2
square_of_mean = []
for idx_class in range(n_class):
    score_of_class = 0
    for idx_student in range(n_student):
        score_of_class += scores[idx_student][idx_class]
    square_of_mean.append((score_of_class / n_student) ** 2)
    print(f'{idx_class+1} class square of mean: {square_of_mean[idx_class]}')
print('--- variance & std ---')  # variance = E[X^2] - (E[X])^2
variance_of_class = []
std_of_class = []
for idx_class in range(n_class):
    variance_of_class.append(mean_of_square[idx_class] - square_of_mean[idx_class])
    std_of_class.append(variance_of_class[idx_class] ** 0.5)
    print(f'{idx_class+1} class variance: {variance_of_class[idx_class]}')
    print(f'{idx_class+1} class standard deviation: {std_of_class[idx_class]}')
print('--- Standardization ---')
for idx_class in range(n_class):
    for idx_student in range(n_student):
        # square_of_mean ** 0.5 recovers the (non-negative) class mean
        scores[idx_student][idx_class] = (
            (scores[idx_student][idx_class] - square_of_mean[idx_class] ** 0.5)
            / std_of_class[idx_class]
        )
print(f'Standardization score: {scores}')
print('--- mean of square ---')  # mean of squares, E[X^2]
mean_of_square = []
for idx_class in range(n_class):
    sum_of_square = 0
    for idx_student in range(n_student):
        sum_of_square += scores[idx_student][idx_class] ** 2
    mean_of_square.append(sum_of_square / n_student)
    print(f'{idx_class+1} class mean of square: {mean_of_square[idx_class]}')
print('--- square of mean ---')  # square of the mean, (E[X])^2
square_of_mean = []
for idx_class in range(n_class):
    score_of_class = 0
    for idx_student in range(n_student):
        score_of_class += scores[idx_student][idx_class]
    square_of_mean.append((score_of_class / n_student) ** 2)
    print(f'{idx_class+1} class square of mean: {square_of_mean[idx_class]}')
print('--- variance & std ---')  # variance = E[X^2] - (E[X])^2
variance_of_class = []
std_of_class = []
for idx_class in range(n_class):
    variance_of_class.append(mean_of_square[idx_class] - square_of_mean[idx_class])
    std_of_class.append(variance_of_class[idx_class] ** 0.5)
    print(f'{idx_class+1} class variance: {variance_of_class[idx_class]}')
    print(f'{idx_class+1} class standard deviation: {std_of_class[idx_class]}')
--- mean of square ---
1 class mean of square: 750.0
2 class mean of square: 1025.0
3 class mean of square: 1350.0
--- square of mean ---
1 class square of mean: 625.0
2 class square of mean: 900.0
3 class square of mean: 1225.0
--- variance & std ---
1 class variance: 125.0
1 class standard deviation: 11.180339887498949
2 class variance: 125.0
2 class standard deviation: 11.180339887498949
3 class variance: 125.0
3 class standard deviation: 11.180339887498949
--- Standardization ---
Standardization score: [[-1.3416407864998738, -1.3416407864998738, -1.3416407864998738], [-0.4472135954999579, -0.4472135954999579, -0.4472135954999579], [0.4472135954999579, 0.4472135954999579, 0.4472135954999579], [1.3416407864998738, 1.3416407864998738, 1.3416407864998738]]
--- mean of square ---
1 class mean of square: 1.0
2 class mean of square: 1.0
3 class mean of square: 1.0
--- square of mean ---
1 class square of mean: 0.0
2 class square of mean: 0.0
3 class square of mean: 0.0
--- variance & std ---
1 class variance: 1.0
1 class standard deviation: 1.0
2 class variance: 1.0
2 class standard deviation: 1.0
3 class variance: 1.0
3 class standard deviation: 1.0
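For reference, the same standardization with NumPy, which reproduces the column means, variances, and standard deviations computed by hand above: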
import numpy as np
scores = [[10, 15, 20], [20, 25, 30], [30, 35, 40], [40, 45, 50]]
print(f'mean by class: {np.mean(scores, axis=0)}')
print(f'variance by class: {np.var(scores, axis=0)}')
print(f'std by class: {np.std(scores, axis=0)}')
scores = (np.array(scores) - np.mean(scores, axis=0)) / np.std(scores, axis=0)
print(f'Standardization: \n{scores}')
print(f'mean by class: {np.mean(scores, axis=0)}')
print(f'variance by class: {np.var(scores, axis=0)}')
print(f'std by class: {np.std(scores, axis=0)}')
mean by class: [25. 30. 35.]
variance by class: [125. 125. 125.]
std by class: [11.18033989 11.18033989 11.18033989]
Standardization:
[[-1.34164079 -1.34164079 -1.34164079]
 [-0.4472136  -0.4472136  -0.4472136 ]
 [ 0.4472136   0.4472136   0.4472136 ]
 [ 1.34164079  1.34164079  1.34164079]]
mean by class: [0. 0. 0.]
variance by class: [1. 1. 1.]
std by class: [1. 1. 1.]