In this blog post, I model a support vector machine in Python. Previously, I modeled and solved the quadratic assignment problem (QAP) in Python using Pyomo. There, I described how the “similarity” of two facilities could be a reason for placing them closer to each other when assigning them to a set of predetermined positions. Now consider a set of new data points (a test dataset) that must be assigned to a set of classes. Points with similar features should generally go into the same class. This time, however, we want to build a machine that identifies those similarities automatically, and it does so using support vectors.
In this article, I model the intelligence of such a machine (i.e. a support vector machine) using mathematical relations and optimize its intelligence via Gekko, an optimization interface in Python with large-scale non-linear solvers.
Modeling and solving the inherent optimization problem in Python
Herein, I code the decision problem for creating a support vector machine (SVM):
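import gekko as op
import itertools as it

#Developer: @KeivanTafakkori, 22 April 2022

def model(U, T, a, b, solve="y"):
    m = op.GEKKO(remote=False, name='SupportVectorMachine')
    # Decision variables: one non-negative dual multiplier per training point
    alpha = {t: m.Var(lb=0, ub=None) for t in T}
    # Parameters: inputs a[t][i] and labels b[t] (+1 or -1)
    n_a = {(t,i): a[t][i] for t,i in it.product(T,U)}
    n_b = {t: b[t] for t in T}
    # Dual objective. Note: the textbook hard-margin dual multiplies the
    # quadratic term by 1/2; that factor is not applied here, which rescales
    # the optimal alpha values (and hence x and z).
    objs = {0: sum(alpha[t] for t in T)
               - sum(alpha[t]*alpha[tt] * n_b[t]*n_b[tt]
                     * sum(n_a[(t,i)]*n_a[(tt,i)] for i in U)
                     for t,tt in it.product(T,T))}
    # Single equality constraint: sum over t of alpha[t]*b[t] equals 0
    cons = {0: {0: (sum(alpha[t]*n_b[t] for t in T) == 0)}}
    m.Maximize(objs[0])
    for keys1 in cons:
        for keys2 in cons[keys1]:
            m.Equation(cons[keys1][keys2])
    if solve == "y":
        m.options.SOLVER=1  # APOPT
        m.solve(disp=True)
        for keys in alpha:
            alpha[keys] = alpha[keys].value[0]
            print(f"alpha[{keys}]", alpha[keys])
    # Recover the weight vector x and the bias z from the solved alpha values
    x = [None for i in U]
    for i in U:
        x[i] = sum(alpha[t]*b[t]*n_a[(t,i)] for t in T)
    z = None
    for t in T:
        if alpha[t] > 0:  # any training point with alpha > 0 determines the bias
            z = b[t] - sum(x[i]*n_a[(t,i)] for i in U)
            break
    return m, x, z, alpha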
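For readers who prefer notation to code, the problem that the function above builds can be written out as follows. This is a transcription of the code, not an independent derivation: the symbols mirror the code's names (`alpha`, `a`, `b`, `x`, `z`, `T`, `U`), and the index s is a label introduced here for the first training point with a positive multiplier. Note that the textbook hard-margin dual carries a factor 1/2 on the quadratic term, which the code does not apply.

```latex
\begin{aligned}
\max_{\alpha}\quad & \sum_{t \in T} \alpha_t \;-\; \sum_{t \in T}\sum_{t' \in T} \alpha_t\,\alpha_{t'}\,b_t\,b_{t'} \sum_{i \in U} a_{ti}\,a_{t'i} \\
\text{s.t.}\quad & \sum_{t \in T} \alpha_t\, b_t = 0, \\
& \alpha_t \ge 0 \quad \forall t \in T, \\[4pt]
\text{then}\quad & x_i = \sum_{t \in T} \alpha_t\, b_t\, a_{ti} \quad \forall i \in U, \\
& z = b_s - \sum_{i \in U} x_i\, a_{si} \quad \text{for some } s \in T \text{ with } \alpha_s > 0.
\end{aligned}
```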
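The decision here is to find the values of alpha, the dual variables (Lagrange multipliers) attached to the training points; the training points with alpha greater than zero are the support vectors. Note that the optimization model coded here is the dual form of the main optimization problem for SVMs. Instead of one margin constraint per observation, the dual has a single equality constraint plus non-negativity bounds on alpha. Moreover, because the training inputs enter only through inner products, those inner products can be replaced by a kernel function (the kernel trick) to create non-linear class boundaries. These characteristics make the dual form well suited to datasets with many features, e.g., images, often without a need for dimensionality reduction, since the size of the problem grows with the number of training points rather than the number of features. On some problems, particularly those with limited training data, an SVM can be competitive with more complex deep neural networks.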
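To make the kernel idea concrete, here is a minimal sketch of what "replacing the inner product" could look like. It is not part of the model in this post: the function names `linear_kernel`, `rbf_kernel`, and `gram_matrix`, as well as the value `gamma=0.5`, are assumptions chosen for illustration only.

```python
import math

def linear_kernel(p, q):
    """Plain inner product: the similarity measure used in the linear dual."""
    return sum(pi * qi for pi, qi in zip(p, q))

def rbf_kernel(p, q, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value, not a tuned one."""
    squared_distance = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return math.exp(-gamma * squared_distance)

def gram_matrix(points, kernel):
    """Pairwise kernel values that would stand in for the inner products."""
    return [[kernel(p, q) for q in points] for p in points]

points = [[1, 2, 2], [2, 3, 3], [3, 4, 5]]
K_linear = gram_matrix(points, linear_kernel)
K_rbf = gram_matrix(points, rbf_kernel)
print(K_linear[0][1])  # inner product of the first two points
print(K_rbf[0][1])     # RBF similarity of the first two points
```

In the dual objective, the inner sum over features plays the role of `linear_kernel`; swapping in a different kernel changes the shape of the boundary without changing the structure of the optimization problem.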
Next, when training the machine is completed, the stored values for alpha and z can be used to make predictions using the following prediction function:
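def classify(dataset, x, z, alpha, a):
    # dataset = [training inputs, training labels]; a = the point to classify
    inputs, outputs = dataset
    T = range(len(outputs))  # training points
    U = range(len(a))        # input features
    # Decision score: sum_i (sum_t alpha[t]*b[t]*a_train[t][i]) * a[i] + z
    # (x already holds the inner sums, so it is not used here)
    score = sum(sum(alpha[t]*outputs[t]*inputs[t][i] for t in T)*a[i] for i in U) + z
    return 1 if score > 0 else -1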
For instance, consider the following dataset:
#     EXP1    EXP2    EXP3    EXP4    EXP5
a = [[1,2,2],[2,3,3],[3,4,5],[4,5,6],[5,7,8]] #Training Dataset (inputs)
b = [  -1   ,   1   ,  -1   ,   1   ,  -1   ] #Training Dataset (outputs)
U = range(len(a[0])) #Set of input features
T = range(len(b)) #Set of the training points
To make predictions, we first model and solve the optimization problem described above:
m, x, z, alpha = model(U,T,a,b) #Model and solve the problem
The results are as follows:
alpha[0] 7.8112756344
alpha[1] 9.8112756622
alpha[2] 2.1887244438
alpha[3] 3.1887243395
alpha[4] 2.9999999235
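The weight vector and bias can also be recomputed from these printed values without rerunning the solver. The sketch below uses only the standard library and hard-codes the alpha values shown above, which assumes a rerun would reproduce them; the names `balance` and `s` are introduced here for illustration.

```python
# Values copied from the printed solver output above.
alpha = [7.8112756344, 9.8112756622, 2.1887244438, 3.1887243395, 2.9999999235]
a = [[1, 2, 2], [2, 3, 3], [3, 4, 5], [4, 5, 6], [5, 7, 8]]
b = [-1, 1, -1, 1, -1]
T = range(len(b))
U = range(len(a[0]))

# Dual feasibility: the label-weighted multipliers should sum to about zero.
balance = sum(alpha[t] * b[t] for t in T)

# Weight vector: x[i] = sum_t alpha[t] * b[t] * a[t][i]
x = [sum(alpha[t] * b[t] * a[t][i] for t in T) for i in U]

# Bias from the first training point with a positive multiplier.
s = next(t for t in T if alpha[t] > 0)
z = b[s] - sum(x[i] * a[s][i] for i in U)

print("balance:", round(balance, 6))
print("x:", [round(v, 6) for v in x])
print("z:", round(z, 6))
```

With these values, the equality constraint holds up to rounding, x comes out close to (3, 0, -2), and z comes out close to 0.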
Then, we apply the prediction function to a data point:
print(classify([a,b],x,z,alpha,[1,2,2])) #Predict the class of the point [1,2,2]
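The point [1,2,2] is the same as the first training example, whose label is -1, so the prediction reproduces a known label 🙂 (this is a check against the training data rather than a measure of accuracy on unseen data). The output is as follows: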
-1
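Because that test point is also a training example, a broader sanity check is to score every training point. The sketch below is standard-library only and hard-codes the alpha values printed earlier, assuming a rerun reproduces them; `decision_scores` is a hypothetical helper, not part of the code in this post. It also scores the points with every alpha doubled, which corresponds to a solution of the textbook dual, where the quadratic term carries a factor 1/2.

```python
alpha = [7.8112756344, 9.8112756622, 2.1887244438, 3.1887243395, 2.9999999235]
a = [[1, 2, 2], [2, 3, 3], [3, 4, 5], [4, 5, 6], [5, 7, 8]]
b = [-1, 1, -1, 1, -1]

def decision_scores(alpha, a, b):
    """Return the score x . a[t] + z for every training point t."""
    T = range(len(b))
    U = range(len(a[0]))
    # Weight vector and bias, recovered as in the model function
    x = [sum(alpha[t] * b[t] * a[t][i] for t in T) for i in U]
    s = next(t for t in T if alpha[t] > 0)
    z = b[s] - sum(x[i] * a[s][i] for i in U)
    return [sum(x[i] * a[t][i] for i in U) + z for t in T]

scores = decision_scores(alpha, a, b)                    # alpha as printed
doubled = decision_scores([2 * v for v in alpha], a, b)  # alpha scaled by 2

for t, label in enumerate(b):
    print(label, round(scores[t], 4), round(doubled[t], 4))
```

With the printed values, the -1 examples score close to -1 while the +1 examples score close to 0, so the predicted sign for the +1 examples hinges on rounding. With the doubled values, every training point scores close to its own label. This suggests that adding the factor 1/2 to the quadratic term of the objective is worth trying before relying on the classifier for new data.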
That’s all! We have built our support vector machine without using a dedicated machine learning package, relying only on an optimization solver and the optimization logic behind the method!
Concluding remarks on support vector machines in Python
In this article, I showed how a classifier, called a support vector machine, can be modeled using an optimization model, then solved to optimality to characterize a prediction function, which can accurately label new observations. In the following articles, I will try to discuss kernels and how to use them effectively.
If this article is going to be used in research or other publishing methods, you can cite it as Tafakkori (2022) (in text) and refer to it as follows: Tafakkori, K. (2022). Creating a support vector machine using Gekko in Python. Supply Chain Data Analytics. url: https://www.supplychaindataanalytics.com/creating-a-support-vector-machine-using-gekko-in-python/