The simple solver with a custom logger example.
This example depends on simple-solver, simple-solver-logging, minimal-cuda-solver.
 
  Introduction
The custom-logger example shows how Ginkgo's API can be leveraged to implement application-specific callbacks for Ginkgo's events. This is the most basic way of extending Ginkgo and a good first step for any application developer who wants to adapt Ginkgo to their specific needs.
Ginkgo's gko::log::Logger abstraction provides hooks to the events that happen during the library execution. These hooks concern any low-level event such as memory allocations, deallocations, copies and kernel launches up to high-level events such as linear operator applications and completion of solver iterations.
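The hook mechanism can be pictured with a small standalone sketch. It does not use Ginkgo and is not Ginkgo's implementation: Event, MiniLogger, IterationCounter and notify_iteration are hypothetical names chosen for illustration. The sketch assumes a bit mask selects which events a logger receives, and that hooks are const member functions, which is why the recording logger keeps its storage mutable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical event identifiers; each event owns one bit of a mask.
enum Event : std::uint32_t {
    allocation_completed = 1u << 0,
    iteration_complete = 1u << 1,
};

// Hypothetical base class: hooks default to doing nothing, and the mask
// passed to the constructor selects which events are forwarded.
class MiniLogger {
public:
    explicit MiniLogger(std::uint32_t enabled_events)
        : enabled_events_{enabled_events}
    {}
    virtual ~MiniLogger() = default;

    bool listens_to(Event e) const { return (enabled_events_ & e) != 0; }

    virtual void on_allocation_completed(std::size_t /*bytes*/) const {}
    virtual void on_iteration_complete(std::size_t /*iteration*/) const {}

private:
    std::uint32_t enabled_events_;
};

// A logger that only cares about completed iterations.
class IterationCounter : public MiniLogger {
public:
    IterationCounter() : MiniLogger(iteration_complete) {}

    void on_iteration_complete(std::size_t iteration) const override
    {
        seen.push_back(iteration);
    }

    // mutable because the hook itself is a const member function
    mutable std::vector<std::size_t> seen;
};

// What a library could do internally when an iteration finishes.
inline void notify_iteration(
    const std::vector<std::shared_ptr<const MiniLogger>>& loggers,
    std::size_t iteration)
{
    for (const auto& logger : loggers) {
        assert(logger != nullptr);
        if (logger->listens_to(iteration_complete)) {
            logger->on_iteration_complete(iteration);
        }
    }
}
```

Calling notify_iteration(loggers, k) once per iteration fills IterationCounter::seen, while a logger constructed with a different mask would be skipped entirely.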
In this example, a simple logger is implemented to track the solver's recurrent residual norm and compute the true residual norm. At the end of the solver execution, a comparison table is shown on-screen.
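To make the difference between the two quantities concrete, here is a minimal sketch of an unpreconditioned CG iteration in plain C++ that records both norms. It is an illustration under simplifying assumptions (dense row-major storage, real values, no stopping criterion other than an iteration limit) and is not how Ginkgo implements its solver; Vec, Mat, NormHistory and cg_with_history are hypothetical names. The recurrent norm comes from the residual that CG updates in place, the true norm is recomputed from b - A*x.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense row-major matrix, for brevity

inline Vec multiply(const Mat& A, const Vec& x)
{
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j) {
            y[i] += A[i][j] * x[j];
        }
    }
    return y;
}

inline double dot(const Vec& a, const Vec& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

struct NormHistory {
    std::vector<double> recurrent;  // ||r_k|| from the update r -= alpha*A*p
    std::vector<double> real;       // ||b - A*x_k|| recomputed from scratch
};

// Unpreconditioned CG for a symmetric positive definite A, starting at x.
inline NormHistory cg_with_history(const Mat& A, const Vec& b, Vec& x,
                                   std::size_t max_iters)
{
    NormHistory history;
    Vec r = b;
    const Vec Ax0 = multiply(A, x);
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] -= Ax0[i];
    }
    Vec p = r;
    double rho = dot(r, r);
    for (std::size_t k = 0; k < max_iters; ++k) {
        // Record both norms for the current iterate x_k.
        history.recurrent.push_back(std::sqrt(rho));
        const Vec Ax = multiply(A, x);
        double true_sq = 0.0;
        for (std::size_t i = 0; i < b.size(); ++i) {
            true_sq += (b[i] - Ax[i]) * (b[i] - Ax[i]);
        }
        history.real.push_back(std::sqrt(true_sq));
        if (rho == 0.0) {
            break;  // exact solution reached
        }
        const Vec q = multiply(A, p);
        const double alpha = rho / dot(p, q);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
        }
        const double rho_new = dot(r, r);
        const double beta = rho_new / rho;
        for (std::size_t i = 0; i < p.size(); ++i) {
            p[i] = r[i] + beta * p[i];
        }
        rho = rho_new;
    }
    return history;
}
```

In exact arithmetic the two histories coincide; in floating point they can drift apart once the residual approaches machine precision, which is the effect the comparison table is meant to expose.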
About the example
Each example has the following sections:
- Introduction: This gives an overview of the example and mentions any interesting aspects in the example that might help the reader.
- The commented program: This section is intended for you to understand the details of the example so that you can play with it and understand Ginkgo and its features better.
- Results: This section shows the results of the code when run. Though the results may not be completely the same, you can expect the behaviour to be similar.
- The plain program: This is the complete code without any comments, to give a complete overview of the code.
 
The commented program
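Utility function which returns the first element (position [0, 0]) of a given gko::matrix::Dense matrix / vector.
template <typename ValueType>
ValueType get_first_element(const gko::matrix::Dense<ValueType>* mtx)
{
Copy the value to the host before accessing it, in case the matrix / vector is stored on a GPU.
    return mtx->get_executor()->copy_val_to_host(mtx->get_const_values());
}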
Utility function which computes the norm of a Ginkgo gko::matrix::Dense vector.
template <typename ValueType>
gko::remove_complex<ValueType> compute_norm(
    const gko::matrix::Dense<ValueType>* b)
{
Get the executor of the vector
    auto exec = b->get_executor();
Initialize a result scalar containing the value 0.0.
    auto b_norm =
        gko::initialize<gko::matrix::Dense<gko::remove_complex<ValueType>>>(
            {0.0}, exec);
Use the dense compute_norm2 function to compute the norm.
    b->compute_norm2(b_norm);
Use the other utility function to return the norm contained in b_norm
    return get_first_element(b_norm.get());
}
Custom logger class which intercepts the residual norm scalar and solution vector in order to print a table of real vs recurrent (internal to the solvers) residual norms.
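template <typename ValueType>
struct ResidualLogger : gko::log::Logger {
    using RealValueType = gko::remove_complex<ValueType>;
Output the logger's data in a table format
    void write() const
    {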
Print a header for the table
std::cout << "Recurrent vs true vs implicit residual norm:"
          << std::endl;
std::cout << '|' << std::setw(10) << "Iteration" << '|' << std::setw(25)
          << "Recurrent Residual Norm" << '|' << std::setw(25)
          << "True Residual Norm" << '|' << std::setw(25)
          << "Implicit Residual Norm" << '|' << std::endl;
Print a separation line. Note that for creating 10 characters std::setw() should be set to 11.
std::cout << '|' << std::setfill('-') << std::setw(11) << '|'
          << std::setw(26) << '|' << std::setw(26) << '|'
          << std::setw(26) << '|' << std::setfill(' ') << std::endl;
Print the data one by one in the form
std::cout << std::scientific;
for (std::size_t i = 0; i < iterations.size(); i++) {
    std::cout << '|' << std::setw(10) << iterations[i] << '|'
              << std::setw(25) << recurrent_norms[i] << '|'
              << std::setw(25) << real_norms[i] << '|' << std::setw(25)
              << implicit_norms[i] << '|' << std::endl;
}
std::defaultfloat could be used here but some compilers do not support it properly, e.g. the Intel compiler
std::cout.unsetf(std::ios_base::floatfield);
Print a separation line
    std::cout << '|' << std::setfill('-') << std::setw(11) << '|'
              << std::setw(26) << '|' << std::setw(26) << '|'
              << std::setw(26) << '|' << std::setfill(' ') << std::endl;
}
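The width arithmetic behind the separation lines can be checked in isolation. The following sketch uses a hypothetical helper, separator_cell, that builds one cell of such a line into a string; it only relies on standard iostream manipulators and shows why a field width of 11 produces 10 fill characters.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Builds "|" followed by `width` dashes and a closing "|".
inline std::string separator_cell(int width)
{
    assert(width >= 0);
    std::ostringstream os;
    // std::setw applies to the next inserted item, here the closing '|'.
    // That character occupies one position of the field, so a field of
    // width + 1 leaves exactly `width` positions for the fill character.
    os << '|' << std::setfill('-') << std::setw(width + 1) << '|';
    return os.str();
}
```

For instance, separator_cell(10) yields a 12-character string: the opening bar, ten dashes, and the closing bar.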
 
    using gko_dense = gko::matrix::Dense<ValueType>;
    using gko_real_dense = gko::matrix::Dense<RealValueType>;

Customize the logging hook which is called every time an iteration is completed
    void on_iteration_complete(const gko::LinOp* solver, const gko::LinOp* b,
                               const gko::LinOp* solution,
                               const gko::size_type& iteration,
                               const gko::LinOp* residual,
                               const gko::LinOp* residual_norm,
                               const gko::LinOp* implicit_sq_residual_norm,
                               const gko::array<gko::stopping_status>*,
                               bool) const override
    {
If the solver shares a residual norm, log its value
        if (residual_norm) {
            auto dense_norm = gko::as<gko_real_dense>(residual_norm);
Add the norm to the recurrent_norms vector
recurrent_norms.push_back(get_first_element(dense_norm));
Otherwise, use the recurrent residual vector
} else {
    auto dense_residual = gko::as<gko_dense>(residual);
Compute the residual vector's norm
auto norm = compute_norm(dense_residual);
Add the computed norm to the recurrent_norms vector
    recurrent_norms.push_back(norm);
}
If the solver shares the current solution vector
if (solution) {
Extract the matrix from the solver
    auto matrix = gko::as<gko::solver::detail::SolverBaseLinOp>(solver)
                      ->get_system_matrix();
Store the matrix's executor
    auto exec = matrix->get_executor();
Create a scalar containing the value 1.0
    auto one = gko::initialize<gko_dense>({1.0}, exec);
Create a scalar containing the value -1.0
    auto neg_one = gko::initialize<gko_dense>({-1.0}, exec);
Instantiate a temporary result variable
    auto res = gko::clone(gko::as<gko_dense>(b));
 Compute the real residual vector by calling apply on the system matrix
matrix->apply(one, solution, neg_one, res);
Compute the norm of the residual vector and add it to the real_norms vector
    real_norms.push_back(compute_norm(res.get()));
} else {
Add to the real_norms vector the value -1.0 if it could not be computed
    real_norms.push_back(-1.0);
}
 
if (implicit_sq_residual_norm) {
    auto dense_norm =
        gko::as<gko_real_dense>(implicit_sq_residual_norm);
Add the norm to the implicit_norms vector
    implicit_norms.push_back(std::sqrt(get_first_element(dense_norm)));
} else {
Add to the implicit_norms vector the value -1.0 if it could not be computed
    implicit_norms.push_back(-1.0);
}
Add the current iteration number to the iterations vector
    iterations.push_back(iteration);
}
Construct the logger
    ResidualLogger()
        : gko::log::Logger(gko::log::Logger::iteration_complete_mask)
    {}

private:
Vector which stores all the recurrent residual norms
mutable std::vector<RealValueType> recurrent_norms{};
Vector which stores all the real residual norms
mutable std::vector<RealValueType> real_norms{};
Vector which stores all the implicit residual norms
mutable std::vector<RealValueType> implicit_norms{};
Vector which stores all the iteration numbers
    mutable std::vector<std::size_t> iterations{};
};
 
 
int main(int argc, char* argv[])
{
Use some shortcuts. In Ginkgo, vectors are seen as a gko::matrix::Dense with one column/one row. The advantage of this concept is that using multiple vectors is now a natural extension: columns/rows are simply added as necessary.
using ValueType = double;
using RealValueType = gko::remove_complex<ValueType>;
using IndexType = int;
using vec = gko::matrix::Dense<ValueType>;
using real_vec = gko::matrix::Dense<RealValueType>;
The gko::matrix::Csr class is used here, but any other matrix class such as gko::matrix::Coo, gko::matrix::Hybrid, gko::matrix::Ell or gko::matrix::Sellp could also be used.
using mtx = gko::matrix::Csr<ValueType, IndexType>;
The gko::solver::Cg is used here, but any other solver class can also be used.
using cg = gko::solver::Cg<ValueType>;
Print the ginkgo version information.
std::cout << gko::version_info::get() << std::endl;
  
Where do you want to run your solver?
The gko::Executor class is one of the cornerstones of Ginkgo. Currently, we have support for a gko::OmpExecutor, which uses OpenMP multi-threading in most of its kernels, a gko::ReferenceExecutor, a single threaded specialization of the OpenMP executor, and a gko::CudaExecutor, which runs the code on an NVIDIA GPU if available.
- Note
- With the help of C++, you see that you only ever need to change the executor and all the other functions/routines within Ginkgo should automatically work and run on the executor without any other changes.
if (argc == 2 && (std::string(argv[1]) == "--help")) {
    std::cerr << "Usage: " << argv[0] << " [executor]" << std::endl;
    std::exit(-1);
}
 
const auto executor_string = argc >= 2 ? argv[1] : "reference";
Figure out where to run the code
std::map<std::string, std::function<std::shared_ptr<gko::Executor>()>>
    exec_map{
        {"omp", [] { return gko::OmpExecutor::create(); }},
        {"cuda",
         [] {
             return gko::CudaExecutor::create(0,
                                              gko::OmpExecutor::create());
         }},
        {"hip",
         [] {
             return gko::HipExecutor::create(0, gko::OmpExecutor::create());
         }},
        {"dpcpp",
         [] {
             return gko::DpcppExecutor::create(0,
                                               gko::OmpExecutor::create());
         }},
        {"reference", [] { return gko::ReferenceExecutor::create(); }}};
executor where Ginkgo will perform the computation
const auto exec = exec_map.at(executor_string)();  
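The lookup pattern used for the executor selection, a map from a name to a factory function, can be sketched without Ginkgo. Backend, ReferenceBackend, ThreadedBackend and make_backend are hypothetical stand-ins invented for this illustration; the only behaviour taken from the example is that std::map::at throws std::out_of_range for an unknown name.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for executor types; only the name matters here.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string name() const = 0;
};
struct ReferenceBackend : Backend {
    std::string name() const override { return "reference"; }
};
struct ThreadedBackend : Backend {
    std::string name() const override { return "omp"; }
};

// Looks up a factory by name and invokes it. Unknown names make
// std::map::at throw std::out_of_range.
inline std::shared_ptr<Backend> make_backend(const std::string& key)
{
    static const std::map<std::string,
                          std::function<std::shared_ptr<Backend>()>>
        factories{
            {"reference", [] { return std::make_shared<ReferenceBackend>(); }},
            {"omp", [] { return std::make_shared<ThreadedBackend>(); }}};
    auto backend = factories.at(key)();
    assert(backend != nullptr);
    return backend;
}
```

Storing factories instead of ready-made objects means that a backend is only constructed when it is actually requested, which matters when constructing it requires hardware that may not be present.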
 
Reading your data and transferring it to the proper device.
Read the matrix, right hand side and the initial solution using the read function. 
- Note
- Ginkgo uses C++ smart pointers to automatically manage memory. To this end, we use our own object ownership transfer functions that under the hood call the required smart pointer functions to manage object ownership. gko::share and gko::give are the functions that you would need to use.
auto A = share(gko::read<mtx>(std::ifstream("data/A.mtx"), exec));
auto b = gko::read<vec>(std::ifstream("data/b.mtx"), exec);
auto x = gko::read<vec>(std::ifstream("data/x0.mtx"), exec);
const RealValueType reduction_factor = 1e-7;
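The ownership transfer described in the note can be illustrated with the standard smart pointers alone. The sketch below is an analogy, not Ginkgo code: to_shared is a hypothetical helper written in the spirit of gko::share, and Matrix, Solver, read_matrix and generate are placeholder names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Placeholder types standing in for a matrix and a solver that keeps it.
struct Matrix {
    std::vector<double> values;
};
struct Solver {
    std::shared_ptr<const Matrix> system_matrix;
};

// Hypothetical helper in the spirit of gko::share: turns unique ownership
// into shared ownership, leaving the source pointer empty.
template <typename T>
std::shared_ptr<T> to_shared(std::unique_ptr<T>&& owner)
{
    assert(owner != nullptr);
    return std::shared_ptr<T>(std::move(owner));
}

// A reader typically returns a uniquely owned object.
inline std::unique_ptr<Matrix> read_matrix()
{
    return std::unique_ptr<Matrix>(new Matrix{{4.0, 1.0}});
}

// The solver stores the matrix, so it needs a share of the ownership.
inline Solver generate(std::shared_ptr<const Matrix> matrix)
{
    return Solver{std::move(matrix)};
}
```

After auto A = to_shared(read_matrix()); and auto solver = generate(A);, both A and the solver refer to the same matrix, and it is released only when the last of them goes away.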
  
Creating the solver
Generate the gko::solver factory. Ginkgo uses the concept of factories to build solvers with certain properties. Observe the fluent interface used here. A CG solver is generated with two stopping criteria: a maximum of 20 iterations and a residual norm reduction factor of 1e-7 (the reduction_factor defined above). The stopping criteria (gko::stop) are also generated from factories using their build methods. You need to specify the executor on which each object is to be built.
auto solver_gen =
    cg::build()
        .with_criteria(gko::stop::Iteration::build().with_max_iters(20u),
                       gko::stop::ResidualNorm<ValueType>::build()
                           .with_reduction_factor(reduction_factor))
        .on(exec);
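The shape of such a fluent interface can be reproduced in a few lines of plain C++. The sketch below only mimics the call pattern; MiniBuilder, MiniFactory and MiniSolver are hypothetical names, the executor is reduced to a string, and none of it reflects how Ginkgo implements its factories.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// All names below are hypothetical; they only mimic the shape of the API.
struct MiniSolver {
    unsigned max_iters;
    double reduction_factor;
    std::string executor;
    std::string matrix_name;
};

class MiniFactory {
public:
    MiniFactory(unsigned max_iters, double reduction_factor, std::string exec)
        : max_iters_{max_iters},
          reduction_factor_{reduction_factor},
          executor_{std::move(exec)}
    {}

    // Every solver generated from this factory shares the same settings.
    std::unique_ptr<MiniSolver> generate(const std::string& matrix_name) const
    {
        return std::make_unique<MiniSolver>(
            MiniSolver{max_iters_, reduction_factor_, executor_, matrix_name});
    }

private:
    unsigned max_iters_;
    double reduction_factor_;
    std::string executor_;
};

class MiniBuilder {
public:
    // Each with_* setter returns *this so that calls can be chained.
    MiniBuilder& with_max_iters(unsigned value)
    {
        max_iters_ = value;
        return *this;
    }
    MiniBuilder& with_reduction_factor(double value)
    {
        assert(value > 0.0);
        reduction_factor_ = value;
        return *this;
    }
    // on() ends the chain and binds the settings to an executor.
    std::unique_ptr<MiniFactory> on(const std::string& executor) const
    {
        return std::make_unique<MiniFactory>(max_iters_, reduction_factor_,
                                             executor);
    }

private:
    unsigned max_iters_{100u};
    double reduction_factor_{1e-6};
};
```

A chain such as MiniBuilder{}.with_max_iters(20u).with_reduction_factor(1e-7).on("reference") then yields a factory, and each generate call on it produces a solver carrying those settings.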
Instantiate a ResidualLogger logger.
auto logger = std::make_shared<ResidualLogger<ValueType>>();
Add the previously created logger to the solver factory. The logger will be automatically propagated to all solvers created from this factory.
solver_gen->add_logger(logger);
Generate the solver from the matrix. The solver factory built in the previous step takes a "matrix"(a gko::LinOp to be more general) as an input. In this case we provide it with a full matrix that we previously read, but as the solver only effectively uses the apply() method within the provided "matrix" object, you can effectively create a gko::LinOp class with your own apply implementation to accomplish more tasks. We will see an example of how this can be done in the custom-matrix-format example
auto solver = solver_gen->generate(A);
Finally, solve the system. The solver, being a gko::LinOp, can be applied to a right hand side, b, to obtain the solution, x.
solver->apply(b, x);
Print the solution to the command line.
std::cout << "Solution (x):\n";
write(std::cout, x);
Print the table of the residuals obtained from the logger
logger->write();
To measure if your solution has actually converged, you can measure the error of the solution. one and neg_one are scalar objects holding the values 1.0 and -1.0; representing them as Ginkgo objects allows for a uniform interface when computing on any device. To compute the residual, all you need to do is call the apply method, which in this case is an spmv and equivalent to the LAPACK z_spmv routine. Finally, you compute the euclidean 2-norm with the compute_norm2 function.
    auto one = gko::initialize<vec>({1.0}, exec);
    auto neg_one = gko::initialize<vec>({-1.0}, exec);
    auto res = gko::initialize<real_vec>({0.0}, exec);
    A->apply(one, x, neg_one, b);
    b->compute_norm2(res);

    std::cout << "Residual norm sqrt(r^T r):\n";
    write(std::cout, res);
}
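The four-argument apply used for the residual overwrites its last argument with alpha * A * x + beta * b. With alpha = 1 and beta = -1 this leaves A*x - b in b, whose norm equals that of the residual b - A*x. A minimal dense sketch of that update, with hypothetical names (advanced_apply, norm2, residual_norm) and plain std::vector storage instead of Ginkgo types:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense row-major storage, for brevity

// Overwrites y with alpha * A * x + beta * y.
inline void advanced_apply(const Mat& A, double alpha, const Vec& x,
                           double beta, Vec& y)
{
    assert(A.size() == y.size());
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == x.size());
        double row_sum = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            row_sum += A[i][j] * x[j];
        }
        y[i] = alpha * row_sum + beta * y[i];
    }
}

inline double norm2(const Vec& v)
{
    double sum = 0.0;
    for (double value : v) {
        sum += value * value;
    }
    return std::sqrt(sum);
}

// Returns ||A * x - b||, which equals ||b - A * x||. b is taken by value
// because the update overwrites it.
inline double residual_norm(const Mat& A, const Vec& x, Vec b)
{
    advanced_apply(A, 1.0, x, -1.0, b);
    return norm2(b);
}
```

Note that the example performs this update on b itself, so the right hand side no longer holds its original values afterwards.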
  
Results
The following is the expected result:
Solution (x):
%%MatrixMarket matrix array real general
19 1
0.252218
0.108645
0.0662811
0.0630433
0.0384088
0.0396536
0.0402648
0.0338935
0.0193098
0.0234653
0.0211499
0.0196413
0.0199151
0.0181674
0.0162722
0.0150714
0.0107016
0.0121141
0.0123025
Recurrent vs true vs implicit residual norm:
| Iteration|  Recurrent Residual Norm|       True Residual Norm|   Implicit Residual Norm|
|----------|-------------------------|-------------------------|-------------------------|
|         0|             4.358899e+00|             4.358899e+00|             4.358899e+00|
|         1|             2.304548e+00|             2.304548e+00|             2.304548e+00|
|         2|             1.467706e+00|             1.467706e+00|             1.467706e+00|
|         3|             9.848751e-01|             9.848751e-01|             9.848751e-01|
|         4|             7.418330e-01|             7.418330e-01|             7.418330e-01|
|         5|             5.136231e-01|             5.136231e-01|             5.136231e-01|
|         6|             3.841650e-01|             3.841650e-01|             3.841650e-01|
|         7|             3.164394e-01|             3.164394e-01|             3.164394e-01|
|         8|             2.277088e-01|             2.277088e-01|             2.277088e-01|
|         9|             1.703121e-01|             1.703121e-01|             1.703121e-01|
|        10|             9.737220e-02|             9.737220e-02|             9.737220e-02|
|        11|             6.168306e-02|             6.168306e-02|             6.168306e-02|
|        12|             4.541231e-02|             4.541231e-02|             4.541231e-02|
|        13|             3.195304e-02|             3.195304e-02|             3.195304e-02|
|        14|             1.616058e-02|             1.616058e-02|             1.616058e-02|
|        15|             6.570152e-03|             6.570152e-03|             6.570152e-03|
|        16|             2.643669e-03|             2.643669e-03|             2.643669e-03|
|        17|             8.588089e-04|             8.588089e-04|             8.588089e-04|
|        18|             2.864613e-04|             2.864613e-04|             2.864613e-04|
|        19|             1.641952e-15|             2.107881e-15|             1.641952e-15|
|----------|-------------------------|-------------------------|-------------------------|
Residual norm sqrt(r^T r):
%%MatrixMarket matrix array real general
1 1
2.10788e-15
Comments about programming and debugging 
 
The plain program
 
 
#include <ginkgo/ginkgo.hpp>
 
#include <fstream>
#include <map>
#include <iomanip>
#include <ios>
#include <iostream>
#include <string>
#include <vector>
 
 
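template <typename ValueType>
ValueType get_first_element(const gko::matrix::Dense<ValueType>* mtx)
{
    return mtx->get_executor()->copy_val_to_host(mtx->get_const_values());
}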
 
 
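template <typename ValueType>
gko::remove_complex<ValueType> compute_norm(
    const gko::matrix::Dense<ValueType>* b)
{
    auto exec = b->get_executor();
    auto b_norm =
        gko::initialize<gko::matrix::Dense<gko::remove_complex<ValueType>>>(
            {0.0}, exec);
    b->compute_norm2(b_norm);
    return get_first_element(b_norm.get());
}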
 
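template <typename ValueType>
struct ResidualLogger : gko::log::Logger {
    using RealValueType = gko::remove_complex<ValueType>;
    void write() const
    {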
        std::cout << "Recurrent vs true vs implicit residual norm:"
                  << std::endl;
        std::cout << '|' << std::setw(10) << "Iteration" << '|' << std::setw(25)
                  << "Recurrent Residual Norm" << '|' << std::setw(25)
                  << "True Residual Norm" << '|' << std::setw(25)
                  << "Implicit Residual Norm" << '|' << std::endl;
        std::cout << '|' << std::setfill('-') << std::setw(11) << '|'
                  << std::setw(26) << '|' << std::setw(26) << '|'
                  << std::setw(26) << '|' << std::setfill(' ') << std::endl;
        std::cout << std::scientific;
        for (std::size_t i = 0; i < iterations.size(); i++) {
            std::cout << '|' << std::setw(10) << iterations[i] << '|'
                      << std::setw(25) << recurrent_norms[i] << '|'
                      << std::setw(25) << real_norms[i] << '|' << std::setw(25)
                      << implicit_norms[i] << '|' << std::endl;
        }
        std::cout.unsetf(std::ios_base::floatfield);
        std::cout << '|' << std::setfill('-') << std::setw(11) << '|'
                  << std::setw(26) << '|' << std::setw(26) << '|'
                  << std::setw(26) << '|' << std::setfill(' ') << std::endl;
    }
 
    using gko_dense = gko::matrix::Dense<ValueType>;
    using gko_real_dense = gko::matrix::Dense<RealValueType>;
 
 
    void on_iteration_complete(const gko::LinOp* solver, const gko::LinOp* b,
                               const gko::LinOp* solution,
                               const gko::size_type& iteration,
                               const gko::LinOp* residual,
                               const gko::LinOp* residual_norm,
                               const gko::LinOp* implicit_sq_residual_norm,
                               const gko::array<gko::stopping_status>*,
                               bool) const override
    {
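        if (residual_norm) {
            auto dense_norm = gko::as<gko_real_dense>(residual_norm);
            recurrent_norms.push_back(get_first_element(dense_norm));
        } else {
            auto dense_residual = gko::as<gko_dense>(residual);
            auto norm = compute_norm(dense_residual);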
            recurrent_norms.push_back(norm);
        }
 
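        if (solution) {
            auto matrix = gko::as<gko::solver::detail::SolverBaseLinOp>(solver)
                              ->get_system_matrix();
            auto exec = matrix->get_executor();
            auto one = gko::initialize<gko_dense>({1.0}, exec);
            auto neg_one = gko::initialize<gko_dense>({-1.0}, exec);
            auto res = gko::clone(gko::as<gko_dense>(b));
            matrix->apply(one, solution, neg_one, res);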
 
            real_norms.push_back(compute_norm(res.get()));
        } else {
            real_norms.push_back(-1.0);
        }
 
        if (implicit_sq_residual_norm) {
            auto dense_norm =
                gko::as<gko_real_dense>(implicit_sq_residual_norm);
            implicit_norms.push_back(std::sqrt(get_first_element(dense_norm)));
        } else {
            implicit_norms.push_back(-1.0);
        }
 
        iterations.push_back(iteration);
    }
 
    ResidualLogger()
        : gko::log::Logger(gko::log::Logger::iteration_complete_mask)
    {}
 
private:
    mutable std::vector<RealValueType> recurrent_norms{};
    mutable std::vector<RealValueType> real_norms{};
    mutable std::vector<RealValueType> implicit_norms{};
    mutable std::vector<std::size_t> iterations{};
};
 
 
int main(int argc, char* argv[])
{
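    using ValueType = double;
    using RealValueType = gko::remove_complex<ValueType>;
    using IndexType = int;

    using vec = gko::matrix::Dense<ValueType>;
    using real_vec = gko::matrix::Dense<RealValueType>;
    using mtx = gko::matrix::Csr<ValueType, IndexType>;
    using cg = gko::solver::Cg<ValueType>;

    std::cout << gko::version_info::get() << std::endl;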
 
    if (argc == 2 && (std::string(argv[1]) == "--help")) {
        std::cerr << "Usage: " << argv[0] << " [executor]" << std::endl;
        std::exit(-1);
    }
 
    const auto executor_string = argc >= 2 ? argv[1] : "reference";
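    std::map<std::string, std::function<std::shared_ptr<gko::Executor>()>>
        exec_map{
            {"omp", [] { return gko::OmpExecutor::create(); }},
            {"cuda",
             [] {
                 return gko::CudaExecutor::create(0,
                                                  gko::OmpExecutor::create());
             }},
            {"hip",
             [] {
                 return gko::HipExecutor::create(0, gko::OmpExecutor::create());
             }},
            {"dpcpp",
             [] {
                 return gko::DpcppExecutor::create(0,
                                                   gko::OmpExecutor::create());
             }},
            {"reference", [] { return gko::ReferenceExecutor::create(); }}};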
 
    const auto exec = exec_map.at(executor_string)();  
 
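    auto A = share(gko::read<mtx>(std::ifstream("data/A.mtx"), exec));
    auto b = gko::read<vec>(std::ifstream("data/b.mtx"), exec);
    auto x = gko::read<vec>(std::ifstream("data/x0.mtx"), exec);
    const RealValueType reduction_factor = 1e-7;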
 
    auto solver_gen =
        cg::build()
            .with_criteria(gko::stop::Iteration::build().with_max_iters(20u),
                           gko::stop::ResidualNorm<ValueType>::build()
                               .with_reduction_factor(reduction_factor))
            .on(exec);
 
    auto logger = std::make_shared<ResidualLogger<ValueType>>();
 
    solver_gen->add_logger(logger);
 
    auto solver = solver_gen->generate(A);
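
    solver->apply(b, x);

    std::cout << "Solution (x):\n";
    write(std::cout, x);

    logger->write();

    auto one = gko::initialize<vec>({1.0}, exec);
    auto neg_one = gko::initialize<vec>({-1.0}, exec);
    auto res = gko::initialize<real_vec>({0.0}, exec);
    A->apply(one, x, neg_one, b);
    b->compute_norm2(res);

    std::cout << "Residual norm sqrt(r^T r):\n";
    write(std::cout, res);
}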