ESPResSo
Extensible Simulation Package for Research on Soft Matter Systems
Loading...
Searching...
No Matches
core/accumulators/Correlator.hpp
Go to the documentation of this file.
1/*
2 * Copyright (C) 2010-2022 The ESPResSo project
3 *
4 * This file is part of ESPResSo.
5 *
6 * ESPResSo is free software: you can redistribute it and/or modify
7 * it under the terms of the GNU General Public License as published by
8 * the Free Software Foundation, either version 3 of the License, or
9 * (at your option) any later version.
10 *
11 * ESPResSo is distributed in the hope that it will be useful,
12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14 * GNU General Public License for more details.
15 *
16 * You should have received a copy of the GNU General Public License
17 * along with this program. If not, see <http://www.gnu.org/licenses/>.
18 */
19
20#pragma once
21
22/** @file
23 *
24 * This module computes correlations (and other two time averages) on
25 * the fly and from files.
26 *
27 * The basic idea is that the user can write arbitrary functions A and B that
28 * can depend on e.g. particle coordinates or whatever state of the MD box.
29 * These functions can be vector-valued, always indicated by dim_A and dim_B.
30 *
31 * The way they can be correlated is can be formulated as a (vector-valued)
32 * operation on A and B. One example would be to calculate the product
33 * component by component, and if A and B are both the particle velocities,
34 * then one would obtain
35 * @f[ <v_1^x v_1^x(t)>, <v_1^y v_1^y(t)>, <v_1^z v_1^z(t)>,
36 * <v_2^x v_2^x(t)>, <v_2^y v_2^y(t)>, <v_2^z v_2^z(t)>, \ldots @f]
37 *
38 * The idea of the algorithm is to not keep all As and Bs in memory, but
39 * feed the algorithm (or the container object) successively by new As and Bs
40 * and new correlation estimates are calculated on the fly. Those As and Bs
41 * which have exceeded a certain age are first compressed (see below) and then
42 * discarded.
43 *
44 * To save memory, increase statistics and make the calculation possible for
45 * many orders of magnitude in time, the order-<i>n</i> blocking algorithm in
46 * @cite frenkel02b (algorithm 9, section 4.4.2 p. 95)
47 * is used. Thus not all As and Bs of the whole "past" are stored but
48 * some of them are blocked. In this implementation, a blocking based on 2 is
49 * always applied: all As and Bs not older than a certain @c tau_lin are stored
50 * as they were, those which are older are stored as average values. All As
51 * and Bs older than <tt>2*tau_lin</tt> are compressed in blocks of four, etc.
52 *
53 * This leads to a hierarchical "history": on level 0 the last @c tau_lin values
54 * are stored. This is done in a cyclic array: the newest is always appended
55 * at the end of the array, and if the array length is reached, values
56 * are appended at the beginning, overwriting older values. We therefore
57 * have to carry the index <tt>newest[0]</tt> which tells, what is the index
58 * of the newest value of A and B.
59 *
60 * As soon as the first row of As and Bs is full, two of them are
61 * compressed by calculating the arithmetic mean, and stored in the
62 * first element on level 1. Then we can overwrite these two values
63 * on level 0, because we have saved them. If necessary,
64 * we produce space on level 0 by compressing to level 1. On
65 * level 1 an array with @c tau_lin entries is available to store
66 * the level-1-compressed values. It is successively filled
67 * and also cyclic. When it is filled, the values are stored
68 * on level 2 and so on.
69 *
70 * This allows to have a "history" over many orders of magnitude
71 * in time, without the full memory effort.
72 *
73 * Correlations are only calculated on each level. For
74 * <tt>tau=1,2,..,tau_lin</tt> the values are taken from level 1.
75 * For <tt>tau=tau_lin, tau_lin+2, .., 2*tau_lin</tt> we take the values
76 * from level 2. On level 2 we also have values for <tt>0,..tau_lin-2</tt>,
77 * but these are discarded as we have "better" estimates on level 1.
78 *
79 * The functions A and B can get extra arguments. This can e.g. be an
80 * integer describing the "type" of the particles to be considered,
81 * or in case of files, it is a file_data_source pointer, which tells
82 * the function from where to read. These arguments have to be a single
83 * pointer to an object that carries all relevant information
84 * to obtain A and B (from the MD Box or somewhere else).
85 *
86 * The correlation has to be initialized with all necessary information, i.e.
87 * all function pointers, the dimensions of A and B and their dimensions, etc.
88 *
89 *
90 * TODO: There is a lot of stuff to do:
91 * - Expand the file_data_source so that one can specify which
92 * columns of the file are to be processed
93 * - calculate an estimate of average values. This might be
94 * even necessary to calculate
95 * @f$\left<\left(A(\tau)-\left<A\right>\right)\left(B(\tau)-\left<B\right>\right)\right>@f$,
96 * which is often probably what people want
97 * - Use the A_args to calculate As and Bs only for particular
98 * particle types (especially and example, so that other people can follow)
99 * - Use the A_args to calculate molecular stuff in combination with
100 * the topology concept
101 * - Write a destructor
102 */
103
104#include "AccumulatorBase.hpp"
105#include "observables/Observable.hpp"
106#include "system/System.hpp"
107
108#include <utils/Vector.hpp>
109
110#include <boost/multi_array.hpp>
111#include <boost/serialization/access.hpp>
112
113#include <cstddef>
114#include <memory>
115#include <string>
116#include <utility>
117#include <vector>
118
119namespace Accumulators {
120
121/** The main correlator class
122 *
123 * Data organization:
124 * We use a ring-like way to manage the data: at the beginning we have a
125 * linear array, which we fill from index 0 to @c tau_lin. The index
126 * <tt>newest[i]</tt> always indicates the latest entry of the hierarchic
127 * "past" For every new entry in is incremented and if @c tau_lin is reached,
128 * it starts again from the beginning.
129 */
131 using obs_ptr = std::shared_ptr<Observables::Observable>;
132 void initialize_operations();
133 void initialize_buffers();
134 void initialize() {
135 initialize_operations();
136 initialize_buffers();
137 }
138
139public:
140 /** The initialization procedure for the correlation object. All important
141 * parameters have to be specified at the same time. They cannot be changed
142 * later, so every instance of the correlation class has to be fed with
143 * correct data from the very beginning.
144 *
145 * @param system The system attached to this correlator
146 * @param delta_N The number of time steps between subsequent updates
147 * @param tau_lin The linear part of the correlation function.
148 * @param tau_max maximal time delay tau to sample
149 * @param obs1 First observable to correlate
150 * @param obs2 Second observable to correlate
151 * @param corr_operation how to correlate the two observables A and B
152 * (this has no default)
153 * @param compress1_ how the A values should be compressed (usually
154 * the linear compression method)
155 * @param compress2_ how the B values should be compressed (usually
156 * the linear compression method)
157 * @param correlation_args_ optional arguments for the correlation function
158 * (currently only used when @p corr_operation is "fcs_acf")
159 *
160 */
162 double tau_max, std::string compress1_, std::string compress2_,
163 std::string corr_operation, obs_ptr obs1, obs_ptr obs2,
164 Utils::Vector3d correlation_args_ = {})
165 : AccumulatorBase(system, delta_N), finalized(false), t(0),
166 m_correlation_args_input(correlation_args_),
167 m_correlation_args(correlation_args_), m_tau_lin(tau_lin),
168 m_dt(system->get_time_step() * static_cast<double>(delta_N)),
169 m_tau_max(tau_max), compressA_name(std::move(compress1_)),
170 compressB_name(std::move(compress2_)),
171 corr_operation_name(std::move(corr_operation)), A_obs(std::move(obs1)),
172 B_obs(std::move(obs2)) {
173 initialize();
174 }
175
176public:
177 /** The function to process a new datapoint of A and B
178 *
179 * First the function finds out if it is necessary to make some space for
180 * the new entries of A and B. Then, if necessary, it compresses old values
181 * of A and B to make room for the new value. Finally, the new values of A
182 * and B are stored in <tt>A[newest[0]]</tt> and <tt>B[newest[0]]</tt>,
183 * where the <tt>newest</tt> indices have been increased before. Finally,
184 * the correlation estimate is updated.
185 * TODO: Not all correlation estimates have to be updated.
186 */
187 void update(boost::mpi::communicator const &comm) override;
188
189 /** At the end of data collection, go through the whole hierarchy and
190 * correlate data left there.
191 */
192 int finalize(boost::mpi::communicator const &comm);
193
194 /** Return correlation result */
195 std::vector<double> get_correlation();
196 std::size_t n_values() const {
197 return m_tau_lin + 1 + (m_tau_lin + 1) / 2 * (m_hierarchy_depth - 1);
198 }
199 std::vector<std::size_t> shape() const override {
200 std::vector<std::size_t> shape = m_shape;
201 shape.insert(shape.begin(), n_values());
202 return shape;
203 }
204 std::vector<int> get_samples_sizes() const {
205 return {n_sweeps.begin(), n_sweeps.end()};
206 }
207 std::vector<double> get_lag_times() const;
208
209 int tau_lin() const { return m_tau_lin; }
210 double tau_max() const { return m_tau_max; }
211 double dt() const { return m_dt; }
212
213 Utils::Vector3d const &correlation_args() const { return m_correlation_args; }
215 m_correlation_args = args;
216 }
217
218 std::string const &compress1() const { return compressA_name; }
219 std::string const &compress2() const { return compressB_name; }
220 std::string const &correlation_operation() const {
221 return corr_operation_name;
222 }
223
224 std::string get_internal_state() const final;
225 void set_internal_state(std::string const &) final;
226
227private:
228 bool finalized; ///< whether the correlation is finalized
229 unsigned int t; ///< global time in number of frames
230
231 Utils::Vector3d m_correlation_args_input;
232 Utils::Vector3d m_correlation_args; ///< additional arguments, which the
233 ///< correlation may need (currently
234 ///< only used by fcs_acf)
235
236 int m_hierarchy_depth; ///< maximum level of data compression
237 int m_tau_lin; ///< number of frames in the linear correlation
238 std::size_t m_dim_corr; ///< number of columns for the correlation
239 double m_dt; ///< time interval at which samples arrive
240 double m_tau_max; ///< maximum time for which the correlation should be
241 ///< calculated
242
243 using multi_array_index_type = boost::multi_array<double, 2>::index;
244
245 std::string compressA_name;
246 std::string compressB_name;
247 std::string corr_operation_name; ///< Name of the correlation operator
248
249 std::shared_ptr<Observables::Observable> A_obs;
250 std::shared_ptr<Observables::Observable> B_obs;
251
252 std::vector<int> tau; ///< time differences
253 boost::multi_array<std::vector<double>, 2> A;
254 boost::multi_array<std::vector<double>, 2> B;
255
256 boost::multi_array<double, 2> result; ///< output quantity
257
258 /// number of correlation sweeps at a particular value of tau
259 std::vector<std::size_t> n_sweeps;
260 /// number of data values already present at a particular value of tau
261 std::vector<long> n_vals;
262 /// index of the newest entry in each hierarchy level
263 std::vector<long> newest;
264
265 std::vector<double> A_accumulated_average; ///< all A values are added up here
266 std::vector<double> B_accumulated_average; ///< all B values are added up here
267 std::size_t n_data; ///< a counter for calculated averages and variances
268
269 std::size_t dim_A; ///< dimensionality of A
270 std::size_t dim_B; ///< dimensionality of B
271 std::vector<std::size_t> m_shape; ///< dimensionality of the correlation
272
273 using correlation_operation_type = std::vector<double> (*)(
274 std::vector<double> const &, std::vector<double> const &,
275 Utils::Vector3d const &);
276
277 correlation_operation_type corr_operation;
278
279 using compression_function = std::vector<double> (*)(
280 std::vector<double> const &A1, std::vector<double> const &A2);
281
282 // compression functions
283 compression_function compressA;
284 compression_function compressB;
285};
286
287} // namespace Accumulators
Vector implementation and trait types for boost qvm interoperability.
std::string get_internal_state() const final
Correlator(::System::System const *system, int delta_N, int tau_lin, double tau_max, std::string compress1_, std::string compress2_, std::string corr_operation, obs_ptr obs1, obs_ptr obs2, Utils::Vector3d correlation_args_={})
The initialization procedure for the correlation object.
std::string const & correlation_operation() const
std::vector< int > get_samples_sizes() const
std::string const & compress1() const
Utils::Vector3d const & correlation_args() const
void set_internal_state(std::string const &) final
std::vector< std::size_t > shape() const override
void set_correlation_args(Utils::Vector3d const &args)
std::vector< double > get_lag_times() const
int finalize(boost::mpi::communicator const &comm)
At the end of data collection, go through the whole hierarchy and correlate data left there.
std::vector< double > get_correlation()
Return correlation result.
std::string const & compress2() const
void update(boost::mpi::communicator const &comm) override
The function to process a new datapoint of A and B.
Main system class.