SolveDB Cheat-Sheet

Here you can find material to start quickly developing practical applications using SolveDB. A set of assignments for practicing your SolveDB skills can be found here.


SolveDB: Basic Syntax

SolveDB extends the standard PostgreSQL with the new SOLVESELECT clause for optimization problems.

If your decision variables fit a single table, use the following SOLVESELECT syntax:

  SOLVESELECT  col_name [, ...] IN [ alias AS ] ( select_stmt )
[ MINIMIZE ( select_stmt ) [ MAXIMIZE ( select_stmt ) ] |
  MAXIMIZE ( select_stmt ) [ MINIMIZE ( select_stmt ) ] ]
[ SUBJECTTO ( select_stmt ) [, ...] ]
[ USING solver_name [. ...] [( param[:= expr] [, ...] )] ]

e.g.,

  SOLVESELECT a IN t AS ( SELECT NULL::float8 AS a  )
  MINIMIZE  ( SELECT a    FROM t)
  SUBJECTTO ( SELECT a>=1 FROM t )
  USING solverlp

Alternatively, the decision columns can be specified in the more compact form:

  SOLVESELECT  [ alias ( col_name [, ...] ) AS ] ( select_stmt )  ...

where col_name can be set to * (asterisk) to denote that ALL columns in the input table are decision columns, e.g.,

  SOLVESELECT  t(*) AS ( SELECT NULL::float8 AS a )  ...

If you decision variables best fit several tables, use the WITH clause within SOLVESELECT:

  WITH [ col_name [, ...] IN ] alias AS ( select_stmt ) [,...]]

e.g.,

  SOLVESELECT a IN t AS ( SELECT NULL::float8 AS a )
         WITH b IN k AS ( SELECT NULL::float8 AS b ),
                   j AS ( SELECT 10.0 AS c ) 
  MINIMIZE  ( SELECT a FROM t)
  SUBJECTTO ( SELECT a >= 1 + c, b=5 FROM t, k, j )
  USING solverlp

In this example, k and j are also known as common decision table expressions (CDTEs). Unlike k, j does not define any decision variables and therefore j is treated like the standard common table expression (CTE).

SolveDB: Solvers

All optimization problems, including linear programming (LP), mixed-integer programming (MIP), derivative-free non-linear (NPL), and domain-specific problems, are specified using the same SOLVESELECT clause.

The choice of the solver (in the USING clause) determines the interpretation of the MAXIMIZE/MINIMIZE and the SUBJECTTO SQL (SELECT) expressions.

There are currently two general-purpose so-called view solvers available in SolveDB:

  • solverlp –  a solver for LP/MIP optimization problems
  • swarmops – a solver for black-box global optimization problems

Each of these may use one of several built-in (low-level) physical solvers (e.g., GLPK, CBC, and simulated annealing), i.e., methods for actual computations of solving the problem.

You can get the full list of installed solvers by executing SOLVESELECT with some non-existent random solver name, e.g.,

   SOLVESELECT (SELECT 1) USING x

You can get the full list of supported physical solvers and additional solver information by passing the parameter help to the view solver, e.g.,

   SOLVESELECT (SELECT 1) USING solverlp(help)

To get the solver output (log), i.e., solving details, use the log_level:=0 parameter.

   ... USING solverlp(log_level:=0)
LP/MIP solver “solverlp”

This view solver supports the following basic linear operators and aggregation functions in the objective and constraints expressions (MAXIMIZE/MINIMIZE/SUBJECTTO):

  • a { +, – } b – where a,b – are decision variables (columns) or expressions
  • a { +, -, * } b – where a is a decision variable (column) or expressions, and b is a scalar
  • sum(a) – where a is a decision variable (column) or an expression

In SUBJECTTO expressions, the following constraint operators are supported:

  • a { <=, =, >=, != }  b – where a – is a decision variable (column)/expression, and b is either a decision variable (column)/expression or a scalar.

e.g.,

   SOLVESELECT t(x) AS (SELECT NULL::float8 AS x 
                        FROM generate_series(1, 10) AS i) 
   MINIMIZE  ( SELECT sum(x) FROM t )
   SUBJECTTO ( SELECT x>=0 FROM t ),
             ( SELECT sum(x) >= 1 FROM t )
   USING solverlp.auto()

The solver automatically detects integer-type decision variables (columns) and transparently solves the problem as MIP instead of LP. The solver can be forced to solve the MIP problem as an LP equivalent by using the physical solver “solverlp.basic”. Other physical solver (method) alternatives are possible:

  • auto (default) – automatically picks the most suitable physical solver. Often prefers GLPK LP or GLPK MIP depending on the presence of the integer-value variables (columns).
  • basic – uses the LP physical solver from the GLPK package.
  • mip – forces the MIP physical solver from the GLPK package.
  • cbc – uses a physical solver from the CBC solver package. Use this for large-scale MIP problems. It often performs better than GLPK MIP for such problem.
The blackbox solver “swarmops”

The solver enforces no restrictions in the objective function specification (MINIMIZE/MAXIMIZE), therefore any SELECT expression can be used to define the objective function (MINIMIZE/MAXIMIZE).

In SUBJECTTO expressions, the solver only allows bounding operators applicable to the decision variables (columns):

  • a { <=, >=, = } c – where a is a decision variable (column) and c is a scalar.

 e.g.,

   SOLVESELECT t(x) AS (SELECT sin(i) AS y, NULL::float8 AS x 
                        FROM generate_series(1, 10) AS i) 
   MINIMIZE  ( SELECT sum(abs(y - x)) FROM t )
   SUBJECTTO ( SELECT -1<=x, x<=1 FROM t )
   USING swarmops.sa(n:=10000)

You can choose from 15+ different global optimization techniques (physical solvers) based on different solving heuristics, e.g., simulated annealing (SA), particle swarm optimization (PSO). Typically, SA and PSO do the good job a variety of problems.

Each of these techniques can further be tuned by changing solver parameters  – see above on how these can be retrieved and set. Among those, the parameter n controls the number of solving iterations to execute (see the above example) – and therefore controls the trade-off between solving time and the solution quality.

LP/MIP modeling tricks

There is a number of modeling tricks to be used in typical real-world optimization applications.

Implementing “all different” constraint (MIP)

Suppose there are 10 integer-valued variables: [math]x_1, x_2, …, x_{10}[/math] and you wish to assign distinct values in the range 1..10 to these variables. This problem can be modeled by factorizing all possible value combinations of these variables (using auxiliary variables) and then using sum constraints, e.g.,

  SELECT i, x FROM (
    SOLVESELECT t(b) AS (SELECT i, x, NULL::boolean AS b 
                         FROM generate_series(1, 10) AS i, -- Generate 10 variables
                              generate_series(1, 10) AS x  -- Generate 10 possible value combinations for each variable
                        )
    SUBJECTTO 
       -- Ensures that only 1 value is assigned to X
      (SELECT sum(b)=1 FROM t GROUP BY i),
       -- Ensures that distinct values are assigned to X'es
      (SELECT sum(b)=1 FROM t GROUP BY x)
    USING solverlp) AS s
  WHERE b
LP: Minimizing absolute deviations

Suppose you want to minimize the sum of absolute deviations:

[math]\text{Minimize} \sum_{i=1}^n |y_i – a_0 – a_1x_{i1} – a_2x_{i2} – \cdots – a_kx_{ik}|[/math]

with respect to the choice of the values of the parameters a0, …, ak, where yi is the value of the ith observation of the dependent variable, and xij is the value of the ith observation of the jth independent variable (j = 1,…,k).

This problem can be rewritten in terms of auxiliary variables ui as:

[math]\text{Minimize} \sum_{i=1}^n u_i[/math]

with respect to a0, …, ak and u0, …, un, subject to

[math] -1 u_i <= y_i-a_0-a_1x_{i1}-a_2x_{i2} – \cdots – a_{k}x_{ik} <= u_i[/math] for i=1,…,n.

SOLVESELECT t(a1, a2) AS (SELECT NULL::float8 AS a1, NULL::float8 AS a2)
 WITH k(u) AS (SELECT x1, x2, 123*x1+12*x2 AS y, NULL::float8 AS u
               FROM generate_series(1, 5) AS x1, 
                    generate_series(1, 5) AS x2)
MINIMIZE   ( SELECT sum(u) FROM k)
SUBJECTTO  ( SELECT -1*u <= ( y - ( a1*x1 + a2*x2) ), 
                            ( y - ( a1*x1 + a2*x2) ) <= u FROM t,k) 
USING solverlp
LP: Absolute values in the objective function

Consider the following optimization problem:

[math]\text{Minimize} \sum_{i=1}^n c_i |x_i| \, \, c_i > 0 \\

\text{Subject to} \sum_{i=1}^n a_{ij} x_i < ( > ) b_i \, \, \text{for} i = 1..n [/math]

with respect to x1 , x2, …, xn.

The absolute values can be avoided by replacing each xi and |xi| as follows:

[math]x_i = x^+_i – x^-_i \\

|x_i| = x^+_i + x^-_i \\

x^+_i >=0, \, x^-_i >=0 [/math]

SOLVESELECT t(xp, xn) AS (SELECT NULL::float8 AS xp, NULL::float8 AS xn, 10.0 AS c)
MINIMIZE  ( SELECT c * (xp + xn) FROM t )
SUBJECTTO ( SELECT xp>=0, xn>=0 FROM t),
          ( SELECT 12 * (xp - xn) >= 120 FROM t)
USING solverlp
 More tricks: https://download.aimms.com/aimms/download/manuals/AIMMS3OM_LinearProgrammingTricks.pdf

Shared Models – for PA applications

SolveDB offers support for decision modes that are composed of basic reusable simulation and value models, e.g., in Prescriptive Analytics (PA) applications.

Such simulation models are created using the SOLVEMODEL construct adopting the same syntax as SOLVESELECT. For example, consider the following model of an ideal battery in discrete time:

CREATE OR REPLACE VIEW models AS
 SELECT (SOLVEMODEL params AS (SELECT 0.0 AS c_min, 12.0 AS c_max, 0.0 AS p_min, 1.5 AS p_max) 
               WITH state0 AS (SELECT 0.0::float8 AS c),
                     input AS (SELECT t, 0.5 AS p FROM generate_series(1, 10) AS t),
                    output AS (WITH RECURSIVE r(t, c) AS (
                                  SELECT 0.0::float8, c FROM state0
                                 UNION ALL
                                  SELECT r.t+1, c + p
                                  FROM r, input
                                  WHERE r.t + 1 = input.t)
                               SELECT * FROM r)
         SUBJECTTO ( SELECT p_min <= p <= p_max FROM params, input),
                   ( SELECT c_min <= c <= c_max FROM params, output) 
        ) AS battery

The SOLVEMODEL specifies the modeled system (battery) as inter-linked virtual tables (CDTEs) for parameters, initial state, input data, and outputs (with computations inside) and with associated constraints (SUBJECTTO). Such models may be generated (selected) using standard SELECT queries, and then be stored in database tables (using INSERT queries) or retrieved using database views (like in this example).

For analysis and verification of the model, the MODELEVAL command is used:

MODELEVAL (SELECT * FROM output) IN (SELECT battery FROM models)

This command evaluates the 1st SELECT query in the context of the model, which is retrieved using the 2nd SELECT statement. Under the hood, the retrieved model is first converted into a standard CTE (WITH query), which is then combined with the user-specified SELECT query (the 1st SELECT) and evaluated like any standard database query. Note, the SUBJECTTO constraints are ignored while using MODELEVAL.

By using standard SQL queries, the model may be modified (i.e., instantiated with new data or constraints) at runtime during the query evaluation. For this, the specialized operator << is used, e.g.,

SELECT battery << (SOLVEMODEL input AS (SELECT t, p FROM input_data))
FROM models

The first argument of this operator is the initial model and the second operator is the “dummy” model which specifies all new data / computations (CDTEs) and constraints (SUBJECTTO) to be imported into the initial model. Note, the names of CDTEs and names of attributes inside there CDTEs must mach those from the initial model – to be able to override the data / computations and the constraints. If the names do not match those in the CDTEs, they will be instead added to the original model.

Such shared models may be included into “larger problems” using the INLINE construct inside SOLVESELECT (and the << operator), e.g.,

SOLVESELECT  t(p) AS (SELECT t, NULL::float8 AS p, 15 - t AS price
                      FROM generate_series(1, 10) AS t) 
     INLINE  m    AS (SELECT m << (SOLVEMODEL input AS (SELECT t, p FROM t)
                                         WITH state0(c) AS (SELECT NULL::float8 AS c)) 
                                         FROM battery) 
MINIMIZE  (SELECT sum(p * price) FROM t) 
SUBJECTTO (SELECT c=0 FROM m_state0),
          (SELECT c=5 FROM m_output WHERE t=5)
USING solverlp;

This defines a problem where the model m is imported from the database view battery. Before import, the decision variables (p) from the outer problem are injected into the model (using <<) to declare that the battery model inputs and the initial state are decision variables (to be optimized). In the outer problem (MINIMIZE / SUBJECTTO), the CDTEs from the inner problem are accessible using the “m_” prefix, e.g., “m_output”, which allows accessing and using the components of the inner problem to define the outer problem.