Monday, August 14, 2023

Shannon Entropy Weights Method (EWM)

The Shannon Entropy Weights Method (EWM) is a powerful tool for selecting an alternative using multiple criteria. It is based on the concept of entropy, which measures the uncertainty or randomness of a system. In the context of decision-making, entropy can be used to measure the uncertainty of each criterion relative to the total entropy of all criteria: criteria with lower entropy discriminate more sharply between alternatives and therefore receive higher weights.

Min-Max Normalization And Relative Entropy Weights

The Shannon Entropy Weights Method is a decision-making method that uses entropy to evaluate the uncertainty of multiple criteria and assign weights to each criterion. The method consists of two main steps:

Calculate Entropy Of Each Criterion

Entropy is a measure of uncertainty or randomness. In the context of decision-making, it represents the uncertainty of each criterion. The entropy of each criterion is calculated using the Shannon-Weaver formula:

H(X) = -\sum_x p(x) \log_2(p(x))

where H(X) is the entropy of the criterion, p(x) is the probability of each possible value of the criterion, and the sum is taken over all possible values of the criterion.
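
As a quick worked example, here is a minimal sketch of the formula in NumPy, using a hypothetical four-outcome probability distribution:

import numpy as np

# Hypothetical probability distribution over four outcomes (sums to 1.0)
p = np.array([0.50, 0.25, 0.15, 0.10])

# Shannon entropy in bits: H(X) = -sum p(x) * log2(p(x))
H = -np.sum(p * np.log2(p))
print(round(H, 2))  # ~1.74 bits; a uniform distribution would give 2.00 bits

The closer a distribution is to uniform, the higher its entropy, and the less the corresponding criterion distinguishes between alternatives.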

Calculate Relative Entropy Weights

After calculating the entropy of each criterion, the relative entropy weights are calculated. The relative entropy weight of each criterion is calculated as one minus the ratio of its entropy to the total entropy of all criteria. The formula for calculating the relative entropy weight is:

w_i = 1 - \frac{H(X_i)}{\sum_j H(X_j)}

where w_i is the relative entropy weight of the i-th criterion, H(X_i) is the entropy of the i-th criterion, and the sum is taken over all criteria.
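
A minimal sketch of this step, assuming three criteria with hypothetical entropy values:

import numpy as np

# Hypothetical entropies for three criteria
H = np.array([0.9, 0.5, 0.7])

# w_i = 1 - H(X_i) / sum_j H(X_j)
w = 1 - H / np.sum(H)
print(w)  # ~[0.571, 0.762, 0.667]

Note that these raw weights sum to k - 1 for k criteria (here, 2.0), not to one; the final normalization step described below rescales them.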

The Min-Max Normalization and Relative Entropy Weights Method is a variation of the Shannon Entropy Weights Method that addresses the issue of some criteria having a large range of values, while others have a small range. This can lead to inaccurate weights, as the entropy of criteria with a large range of values may dominate the total entropy.

To address this issue, the Min-Max Normalization and Relative Entropy Weights Method normalizes the data for each criterion by subtracting its minimum value and then dividing by its range. The formula for normalizing data for each criterion is:

x_i^* = \frac{x_i - \min(x)}{\max(x) - \min(x)}

where x_i^* is the normalized value of the criterion for alternative i, x_i is its original value, and \min(x) and \max(x) are the minimum and maximum values of that criterion across all alternatives.
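
For example, a minimal sketch of min-max normalization for a single criterion, using hypothetical raw values:

import numpy as np

# Hypothetical raw values of one criterion across five alternatives
x = np.array([10.0, 30.0, 20.0, 50.0, 40.0])

# Rescale to [0, 1]: subtract the minimum, divide by the range
x_star = (x - np.min(x)) / (np.max(x) - np.min(x))
print(x_star)  # [0.   0.5  0.25 1.   0.75]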

After normalizing the data for each criterion, the entropy of each criterion is calculated from the normalized data, and the relative entropy weights are calculated from these entropies:

w_i = 1 - \frac{H(X_i^*)}{\sum_j H(X_j^*)}

The final weights are then calculated by normalizing relative entropy weights so that they add up to one:

w_{i,\text{final}} = \frac{w_i}{\sum_j w_j}
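
Continuing the hypothetical three-criterion example above, a minimal sketch of the final normalization:

import numpy as np

# Raw relative entropy weights from the earlier hypothetical example
w = np.array([0.571, 0.762, 0.667])

# Rescale so the weights sum to one
w_final = w / np.sum(w)
print(w_final)  # ~[0.286, 0.381, 0.334]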

The Min-Max Normalization and Relative Entropy Weights Method provides a more accurate way of calculating weights for criteria with different ranges of values. It ensures that weights are based on the relative uncertainty of each criterion while taking its range of values into account.

One way to implement the Shannon Entropy Weights Method is by using the Min-Max Normalization and Relative Entropy Weights Method. This approach involves two main steps:

  • Min-Max Normalization: This step involves normalizing the data for each criterion using the min-max normalization formula, which scales the data to a range from zero to one. This is important because it ensures that all criteria are on the same scale and can be compared directly.

  • Relative Entropy Weights Method: This step involves calculating the entropy for each criterion and then using the relative entropy formula to calculate the weights for each criterion. The relative entropy formula calculates the weight for each criterion as one minus the ratio of the entropy of that criterion to the total entropy of all criteria.

In the Python script below, the first step is implemented in the 'normalize_data' function, and the second in the 'calculate_entropy' and 'calculate_weights' functions.

Alternatives And Criteria

The Python script handles the number of alternatives, the number of criteria, and criteria where lower values are better, as follows:

  • Number of alternatives: The script reads data from a CSV file, where each row represents an alternative. The number of alternatives is determined by the number of rows in the CSV file. In the sample CSV file, there are seven alternatives: Alpha, Bravo, Charlie, Delta, Echo, Foxtrot, and Golf.

  • Number of criteria: The script reads data from a CSV file, where each column (except for the first column) represents a criterion. The number of criteria is determined by the number of columns in the CSV file (minus one for the first column, which contains the names of the alternatives). In the sample CSV file, there are nine criteria: E, R, Y, F, Z, W, D, L, and P.

  • Criteria where lower value is better: The script takes as input a list of criteria where a lower value is considered better, specified in the 'lower_is_better' variable. The assignment lower_is_better = ['R', 'Y', 'L', 'P'] means that for criteria R, Y, L, and P, a lower value is considered better. The script uses this information by negating the values for these criteria before normalization, so that higher normalized values are always preferable.

EWM Script (Illustrative)

Our sample script is merely a starting point for illustrating the use of EWM in a sports (AvK) handicapping scenario. It is by no means 'production-ready' (limited testing and little or no error handling). Use it to generate an automatic baseline for evaluating competitors in a sports event across multiple criteria of your own choosing.

# -*- coding: latin-1 -*-

__author__ = 'matekus'
__version__ = 'v1.02'
__copyright__ = '(c) 2023, ' + __author__ + ', All Rights Reserved.'
__address__ = '[ https://vendire-ludorum.blogspot.com/ ]'

import pandas as pd
import numpy as np
from functools import reduce

def read_data(state: dict) -> dict:
    """Read data from CSV file.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the data loaded from the CSV file.
    """
    # Read data from CSV file
    data = pd.read_csv(state['file_path'], index_col=0)
    state['data'] = data
    return state


def apply_constraints(state: dict) -> dict:
    """Apply constraints.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the constraints applied.
    """
    # Apply constraints
    lower_is_better = state['lower_is_better']
    for col in lower_is_better:
        state['data'][col] = state['data'][col].apply(lambda x: -x)
    return state


def normalize_data(state: dict) -> dict:
    """Normalize data.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the data normalized.
    """
    # Normalize data
    data = state['data'].apply(lambda x: (x - np.min(x)) / (np.max(x) - np.min(x)))
    state['data'] = data
    return state


def calculate_entropy(state: dict) -> dict:
    """Calculate entropy for each criterion.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the entropy for each criterion.
    """
    # Calculate entropy for each criterion
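    # Note: as a simplification, the min-max normalized column values are
    # fed straight into the Shannon formula (natural log, with a small
    # epsilon to avoid log(0)) rather than first being rescaled into a
    # probability distribution that sums to one.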
    entropy = state['data'].apply(lambda x: -np.sum(x * np.log(x + 1e-6)))
    state['entropy'] = entropy
    return state


def calculate_weights(state: dict) -> dict:
    """Calculate weights for each criterion.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the weights for each criterion.
    """
    # Calculate weights for each criterion
    weights = 1 - state['entropy'] / np.sum(state['entropy'])
    state['weights'] = weights
    return state


def calculate_scores(state: dict) -> dict:
    """Calculate final scores for each alternative.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the final scores for each alternative.
    """
    # Calculate final scores for each alternative
    scores = state['data'].dot(state['weights'])
    state['scores'] = scores
    return state


def normalize_scores(state: dict) -> dict:
    """Normalize scores to add up to 1.00.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the scores normalized.
    """
    # Normalize scores to add up to 1.00
    scores = state['scores'] / np.sum(state['scores'])
    state['scores'] = scores
    return state


def sort_scores(state: dict) -> dict:
    """Sort scores in descending order.
    Args:
        state: The state dictionary.
    Returns:
        The updated state dictionary with the scores sorted.
    """
    # Sort scores in descending order
    state['scores'] = state['scores'].sort_values(ascending=False)
    return state


# Define pipeline
pipeline = [
    read_data,
    apply_constraints,
    normalize_data,
    calculate_entropy,
    calculate_weights,
    calculate_scores,
    normalize_scores,
    sort_scores
]


def main():
    """The main function.
    Args:
        None
    Returns:
        None
    """

    print('')
    print('Shannon Entropy Weights Method.')
    print(__copyright__)
    print(__address__)
    print('')

    # Initialize state
    state = {
        'file_path': "C:\\data\\Shannon_Entropy_Weights_Blog_Example.csv",
        'lower_is_better': ['R', 'Y', 'L', 'P']
    }

    # Run pipeline
    final_state = reduce(lambda v, f: f(v), pipeline, state)

    # Format output as table
    scores = final_state['scores']
    formatted_scores = scores.to_frame('Score')
    formatted_scores.index.name = 'Alternative'
    formatted_scores = formatted_scores.reset_index()
    formatted_scores = formatted_scores.to_string(index=False, formatters={'Alternative': '{:<11}'.format, 'Score': '{:>10.6f}'.format})
    print(formatted_scores)

    print('')
    print('Fini!')
    print('')


if __name__ == "__main__":
    main()

Input CSV (Illustrative)

Our sample input CSV file describes a nominal sports event with seven competitors and nine criteria. For criteria 'R', 'Y', 'L', and 'P', lower values are better, while for the other five criteria higher values are better.

Alternative     E  R  Y     F      Z     W   D    L       P
Alpha        2493  3  3  0.92   9.77  1.03  41  131    3.25
Bravo        2110  4  4  0.78   9.32  1.17   7  128   34.00
Charlie      2395  3  3  0.89  10.21  1.06  53  131    6.00
Delta        2318  5  5  0.75  10.02  1.15  27  131   13.00
Echo         1938  6  6  0.60   8.97  1.03   5  128  101.00
Foxtrot      2622  3  3  0.97  10.47  1.06  50  128    3.50
Golf         2332  3  3  0.79   9.48  1.15  41  131    3.75
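
For convenience, the table above can be written to disk with a short helper (a minimal sketch; adjust the output path to match the 'file_path' used in the script):

csv_text = """Alternative,E,R,Y,F,Z,W,D,L,P
Alpha,2493,3,3,0.92,9.77,1.03,41,131,3.25
Bravo,2110,4,4,0.78,9.32,1.17,7,128,34.00
Charlie,2395,3,3,0.89,10.21,1.06,53,131,6.00
Delta,2318,5,5,0.75,10.02,1.15,27,131,13.00
Echo,1938,6,6,0.60,8.97,1.03,5,128,101.00
Foxtrot,2622,3,3,0.97,10.47,1.06,50,128,3.50
Golf,2332,3,3,0.79,9.48,1.15,41,131,3.75
"""

with open('Shannon_Entropy_Weights_Blog_Example.csv', 'w') as f:
    f.write(csv_text)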

Output Scores (Illustrative)

The script outputs the competitors in descending order of scores.

Shannon Entropy Weights Method.
(c) 2023, matekus, All Rights Reserved.
[ https://vendire-ludorum.blogspot.com/ ]  

Alternative      Score
Foxtrot       0.218816
Charlie       0.171725
Golf          0.161934
Alpha         0.158567
Bravo         0.138563
Delta         0.120214
Echo          0.030181  

Fini!

Enjoy!