Here you can find material to start quickly developing practical applications using SolveDB. A set of assignments for practicing your SolveDB skills can be found here.

## SolveDB: Basic Syntax

SolveDB extends the standard PostgreSQL with the new **SOLVESELECT** clause for optimization problems.

If your decision variables fit a single table, use the following **SOLVESELECT** syntax:

SOLVESELECTcol_name [, ...]IN[ aliasAS] ( select_stmt ) [MINIMIZE( select_stmt ) [MAXIMIZE( select_stmt ) ] |MAXIMIZE( select_stmt ) [MINIMIZE( select_stmt ) ] ] [SUBJECTTO( select_stmt ) [, ...] ] [USINGsolver_name [. ...] [( param[:= expr] [, ...] )] ]

e.g.,

SOLVESELECTaINtAS(SELECTNULL::float8ASa )MINIMIZE(SELECTaFROMt)SUBJECTTO(SELECTa>=1FROMt )USINGsolverlp

Alternatively, the decision columns can be specified in the more compact form:

SOLVESELECT[ alias ( col_name [, ...] )AS] ( select_stmt ) ...

where *col_name* can be set to * (asterisk) to denote that ALL columns in the input table are decision columns, e.g.,

SOLVESELECTt(*)AS(SELECTNULL::float8ASa ) ...

If you decision variables best fit several tables, use the **WITH** clause within **SOLVESELECT**:

WITH[ col_name [, ...]IN] aliasAS( select_stmt ) [,...]]

e.g.,

SOLVESELECTaINtAS(SELECTNULL::float8ASa )WITHbINkAS(SELECTNULL::float8ASb ), jAS(SELECT10.0ASc )MINIMIZE(SELECTaFROMt)SUBJECTTO(SELECTa >= 1 + c, b=5FROMt, k, j )USINGsolverlp

In this example, k and j are also known as *common decision table expressions (CDTEs). *Unlike k, j does not define any decision variables and therefore j is treated like the standard common table expression (CTE).

## SolveDB: Solvers

All optimization problems, including *linear programming* (LP), *mixed-integer programming* (MIP), *derivative-free non-linear* (NPL), and domain-specific problems, are specified using the same **SOLVESELECT** clause.

The choice of the solver (in the **USING** clause) determines the interpretation of the **MAXIMIZE/MINIMIZE** and the **SUBJECTTO **SQL (SELECT) expressions.

There are currently two general-purpose so-called *view solvers *available in SolveDB:

**solverlp**– a solver for LP/MIP optimization problems**swarmops**– a solver for black-box global optimization problems

Each of these may use one of several built-in (low-level) *physical solvers *(e.g., *GLPK*,* CBC*, and *simulated annealing*), i.e., methods for actual computations of solving the problem.

You can get the full list of installed solvers by executing **SOLVESELECT **with some non-existent random solver name, e.g.,

SOLVESELECT(SELECT1)USINGx

You can get the full list of supported *physical solvers and *additional solver information by passing the parameter *help* to the *view solver, e.g., *

SOLVESELECT(SELECT1)USINGsolverlp(help)

To get the solver output (log), i.e., solving details, use the *log_level:=0 *parameter.

...USINGsolverlp(log_level:=0)

**LP/MIP solver “solverlp”**

This view solver supports the following basic linear operators and aggregation functions in the objective and constraints expressions (**MAXIMIZE/MINIMIZE/SUBJECTTO**):

- a { +, – } b – where
*a,b*– are decision variables (columns) or expressions - a { +, -, * } b – where
*a*is a decision variable (column) or expressions, and*b*is a scalar - sum(a) – where a is a decision variable (column) or an expression

In **SUBJECTTO** expressions, the following constraint operators are supported:

- a { <=, =, >=, != } b – where
*a*– is a decision variable (column)/expression, and b is either a decision variable (column)/expression or a scalar.

e.g.,

SOLVESELECTt(x)AS(SELECTNULL::float8ASxFROMgenerate_series(1, 10)ASi)MINIMIZE(SELECTsum(x)FROMt )SUBJECTTO(SELECTx>=0FROMt ), (SELECTsum(x) >= 1FROMt )USINGsolverlp.auto()

The solver automatically detects integer-type decision variables (columns) and transparently solves the problem as MIP instead of LP. The solver can be forced to solve the MIP problem as an LP equivalent by using the physical solver “solverlp.basic”. Other physical solver (method) alternatives are possible:

**auto**(default) – automatically picks the most suitable physical solver. Often prefers GLPK LP or GLPK MIP depending on the presence of the integer-value variables (columns).

**basic**– uses the LP physical solver from the GLPK package.

**mip**– forces the MIP physical solver from the GLPK package.**cbc**– uses a physical solver from the CBC solver package. Use this for large-scale MIP problems. It often performs better than GLPK MIP for such problem.

**The blackbox solver “swarmops”**

The solver enforces no restrictions in the objective function specification (MINIMIZE/MAXIMIZE), therefore any SELECT expression can be used to define the objective function (**MINIMIZE**/**MAXIMIZE**).

In **SUBJECTTO** expressions, the solver only allows bounding operators applicable to the decision variables (columns):

- a { <=, >=, = } c – where
*a*is a decision variable (column) and*c*is a scalar.

* e.g.,
*

SOLVESELECTt(x)AS(SELECTsin(i)ASy, NULL::float8ASxFROMgenerate_series(1, 10)ASi)MINIMIZE(SELECTsum(abs(y - x))FROMt )SUBJECTTO(SELECT-1<=x, x<=1FROMt )USINGswarmops.sa(n:=10000)

You can choose from 15+ different global optimization techniques (physical solvers) based on different solving heuristics, e.g., *simulated annealing (SA)*, *particle swarm optimization (PSO). *Typically, *SA *and PSO do the good job a variety of problems.

Each of these techniques can further be tuned by changing solver parameters – see above on how these can be retrieved and set. Among those, the parameter *n* controls the number of solving iterations to execute (see the above example) – and therefore controls the trade-off between solving time and the solution quality.

## LP/MIP modeling tricks

There is a number of modeling tricks to be used in typical real-world optimization applications.

##### Implementing “all different” constraint (MIP)

Suppose there are 10 integer-valued variables: [math]x_1, x_2, …, x_{10}[/math] and you wish to assign distinct values in the range 1..10 to these variables. This problem can be modeled by factorizing all possible value combinations of these variables (using auxiliary variables) and then using *sum *constraints, e.g.,

SELECTi, xFROM(SOLVESELECTt(b)AS(SELECTi, x, NULL::booleanASbFROMgenerate_series(1, 10) AS i, -- Generate 10 variables generate_series(1, 10) AS x -- Generate 10 possible value combinations for each variable )SUBJECTTO-- Ensures that only 1 value is assigned to X(SELECTsum(b)=1FROMtGROUP BYi), -- Ensures that distinct values are assigned to X'es (SELECTsum(b)=1FROMtGROUP BYx)USINGsolverlp)ASsWHEREb

##### LP: Minimizing absolute deviations

Suppose you want to minimize the sum of absolute deviations:

[math]\text{Minimize} \sum_{i=1}^n |y_i – a_0 – a_1x_{i1} – a_2x_{i2} – \cdots – a_kx_{ik}|[/math]

with respect to the choice of the values of the parameters *a _{0}, …, a_{k}*, where

*y*is the value of the

_{i}*i*

^{th}observation of the dependent variable, and

*x*

_{ij}is the value of the

*i*

^{th}observation of the

*j*

^{th}independent variable (

*j*= 1,…,

*k*).

This problem can be rewritten in terms of auxiliary variables *u _{i }*as:

[math]\text{Minimize} \sum_{i=1}^n u_i[/math]

with respect to *a _{0}, …, a_{k}* and

*u*, subject to

_{0}, …, u_{n}[math] -1 u_i <= y_i-a_0-a_1x_{i1}-a_2x_{i2} – \cdots – a_{k}x_{ik} <= u_i[/math] for i=1,…,n.

SOLVESELECTt(a1, a2)AS(SELECTNULL::float8ASa1, NULL::float8ASa2)WITHk(u)AS(SELECTx1, x2, 123*x1+12*x2ASy, NULL::float8ASuFROMgenerate_series(1, 5)ASx1, generate_series(1, 5)ASx2)MINIMIZE(SELECTsum(u)FROMk)SUBJECTTO(SELECT-1*u <= ( y - ( a1*x1 + a2*x2) ), ( y - ( a1*x1 + a2*x2) ) <= uFROMt,k)USINGsolverlp

##### LP: Absolute values in the objective function

Consider the following optimization problem:

[math]\text{Minimize} \sum_{i=1}^n c_i |x_i| \, \, c_i > 0 \\

\text{Subject to} \sum_{i=1}^n a_{ij} x_i < ( > ) b_i \, \, \text{for} i = 1..n [/math]

with respect to *x _{1 , }*

*x*, …,

_{2}*x*

_{n.}The absolute values can be avoided by replacing each *x _{i }*and |

*x*| as follows:

_{i}[math]x_i = x^+_i – x^-_i \\

|x_i| = x^+_i + x^-_i \\

x^+_i >=0, \, x^-_i >=0 [/math]

SOLVESELECTt(xp, xn)AS(SELECTNULL::float8ASxp, NULL::float8ASxn, 10.0ASc)MINIMIZE(SELECTc * (xp + xn)FROMt )SUBJECTTO(SELECTxp>=0, xn>=0FROMt), (SELECT12 * (xp - xn) >= 120FROMt)USINGsolverlp

##### More tricks: https://download.aimms.com/aimms/download/manuals/AIMMS3OM_LinearProgrammingTricks.pdf

## Shared Models – for PA applications

SolveDB offers support for *decision modes* that are composed of basic reusable *simulation* and *value *models, e.g., in Prescriptive Analytics (PA) applications.

Such simulation models are created using the **SOLVEMODEL **construct adopting the same syntax as **SOLVESELECT. **For example, consider the following model of an ideal battery in discrete time:

CREATEORREPLACEVIEWmodelsASSELECT(SOLVEMODELparamsAS(SELECT0.0ASc_min, 12.0ASc_max, 0.0ASp_min, 1.5ASp_max)WITHstate0AS(SELECT0.0::float8ASc), inputAS(SELECTt, 0.5ASpFROMgenerate_series(1, 10)ASt), outputAS(WITHRECURSIVEr(t, c)AS(SELECT0.0::float8, cFROMstate0UNION ALLSELECTr.t+1, c + pFROMr, inputWHEREr.t + 1 = input.t)SELECT*FROMr)SUBJECTTO(SELECTp_min <= p <= p_maxFROMparams, input), (SELECTc_min <= c <= c_maxFROMparams, output) )ASbattery

The **SOLVEMODEL **specifies the modeled system (battery) as inter-linked virtual tables (CDTEs) for *parameters*, initial state, input *data,* and *outputs* (with computations inside) and with associated constraints (**SUBJECTTO**). Such models may be generated (selected) using standard SELECT queries, and then be stored in database tables (using INSERT queries) or retrieved using database views (like in this example).

For analysis and verification of the model, the **MODELEVAL **command is used:

MODELEVAL(SELECT*FROMoutput)IN(SELECTbatteryFROMmodels)

This command evaluates the 1st **SELECT** query in the context of the model, which is retrieved using the 2nd **SELECT** statement. Under the hood, the retrieved model is first converted into a standard CTE (WITH query), which is then combined with the user-specified SELECT query (the 1st SELECT) and evaluated like any standard database query. Note, the **SUBJECTTO** constraints are ignored while using **MODELEVAL**.

By using standard SQL queries, the model may be *modified *(i.e., instantiated with new data or constraints) at runtime during the query evaluation. For this, the specialized operator << is used, e.g.,

SELECTbattery << (SOLVEMODELinputAS(SELECTt, pFROMinput_data))FROMmodels

The first argument of this operator is the* initial model* and the second operator is the *“dummy” model* which specifies all new data / computations (CDTEs) and constraints (**SUBJECTTO**) to be imported into the initial model. Note, the names of CDTEs and names of attributes inside there CDTEs must mach those from the initial model – to be able to override the data / computations and the constraints. If the names do not match those in the CDTEs, they will be instead added to the original model.

Such shared models may be included into “larger problems” using the** INLINE** construct inside **SOLVESELECT **(and the << operator), e.g.,

SOLVESELECTt(p)AS(SELECTt, NULL::float8ASp, 15 - tASpriceFROMgenerate_series(1, 10)ASt)INLINEmAS(SELECTm << (SOLVEMODELinputAS(SELECTt, pFROMt)WITHstate0(c)AS(SELECTNULL::float8ASc))FROMbattery)MINIMIZE(SELECTsum(p * price)FROMt)SUBJECTTO(SELECTc=0FROMm_state0), (SELECTc=5FROMm_outputWHEREt=5)USINGsolverlp;

This defines a problem where the model *m* is imported from the database view *battery. *Before import, the decision variables (p) from the outer problem are injected into the model (using <<) to declare that the battery model inputs and the initial state are decision variables (to be optimized). In the outer problem (**MINIMIZE **/ **SUBJECTTO**), the CDTEs from the inner problem are accessible using the “m_” prefix, e.g., “m_output”, which allows accessing and using the components of the inner problem to define the outer problem.