mirror of
git://gcc.gnu.org/git/gcc.git
synced 2024-12-05 04:29:55 +08:00
0782b01c9e
On some AArch64 bootstrapped builds, we were getting a flaky test because the floating point operations in `get_time` were being fused with the floating point operations in `timevar_accumulate`. This meant that the rounding behaviour of our multiplication with `ticks_to_msec` was different when used in `timer::start` and when performed in `timer::stop`. These extra inaccuracies led to the testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

------------------------------
Avoiding the inlining was agreed to be undesirable. Three alternative approaches:
1) Use `-ffp-contract=on` to avoid this particular optimisation.
2) Adjust the code so that the "tolerance" is always of the order of a "tick".
3) Record times and elapsed differences in integral values.
   - Could be in terms of a standard measurement (e.g. nanoseconds or microseconds).
   - Could be in terms of whatever integral value ("ticks" / seconds&microseconds / "clock ticks") is returned from the syscall chosen at configure time.

While `-ffp-contract=on` removes the problem that I bumped into, there has been a similar bug on x86 that was to do with a different floating point problem that also happens after `get_time` and `timevar_accumulate` both being inlined into the same function. Hence it seems worth choosing a different approach.

Of the two other solutions, recording measurements in integral values seems the most robust against slightly "off" measurements being presented to the user -- even though adjusting the tolerance could also avoid the ICE that creates a flaky test.

I considered storing time in whatever units our syscall returns and normalising them at the time we print out rather than normalising them to nanoseconds at the point we record our "current time". The logic being that normalisation could have some rounding effect (e.g. if TICKS_PER_SECOND is 3) that would be taken into account in calculations.

I decided against it in order to give the values recorded in `timevar_time_def` some interpretive value so it's easier to read the code. Compared to the small rounding that would represent a tiny amount of time and AIUI cannot trigger the same kind of ICEs as we are attempting to fix, said interpretive value seems more valuable.

Recording time in microseconds seemed reasonable since all obvious values for ticks and `getrusage` are at microsecond granularity or less precise. That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both variables given to us by the host system, I was not sure enough of that to choose microseconds, so times are recorded in nanoseconds.

------------------------------
timer::all_zero is ignoring rows which are inconsequential to the user and would be printed out as all zeros. Since upon printing rows we convert to the same double value and print out the same precision as before, we return true/false based on the same amount of time as before.

timer::print_row casts to a floating point measurement in units of seconds as was printed out before.

timer::validate_phases -- I'm printing out nanoseconds here rather than floating point seconds. Since this is an error message for when things have "gone wrong", printing out the actual nanoseconds that have been recorded seems like the best approach.

N.b. since we now print out nanoseconds instead of a floating point value, the padding requirements are different. Originally we were padding to 24 characters and printing 18 decimal places. This looked odd with the now visually smaller values getting printed. I judged 13 characters (corresponding to 2 hours) to be a reasonable point at which our alignment could start to degrade and this provides a more compact output for the majority of cases (checked by triggering the error case via GDB).

------------------------------
N.b. I use a literal 1000000000 for "NANOSEC_PER_SEC". I believe this would fit in an integer on all hosts that GCC supports, but am not certain there are not strange integer sizes we support, hence am pointing it out for special attention during review.

------------------------------
No expected change in generated code.
Bootstrapped and regtested on AArch64 with no regressions.

Hope this is acceptable -- I had originally planned to use `-ffp-contract` as agreed until I saw mention of the old x86 bug in the same area which was not to do with floating point contraction of operations (PR 99903).

gcc/ChangeLog:

	PR middle-end/110316
	PR middle-end/9903
	* timevar.cc (NANOSEC_PER_SEC, TICKS_TO_NANOSEC, CLOCKS_TO_NANOSEC, nanosec_to_floating_sec, percent_of): New.
	(TICKS_TO_MSEC, CLOCKS_TO_MSEC): Remove these macros.
	(timer::validate_phases): Use integral arithmetic to check validity.
	(timer::print_row, timer::print): Convert from integral nanoseconds to floating point seconds before printing.
	(timer::all_zero): Change limit to nanosec count instead of fractional count of seconds.
	(make_json_for_timevar_time_def): Convert from integral nanoseconds to floating point seconds before recording.
	* timevar.h (struct timevar_time_def): Update all measurements to use uint64_t nanoseconds rather than seconds stored in a double.
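The integral approach described in the message can be sketched roughly as follows. This is a minimal illustration and not the code from timevar.cc: the helper names echo those listed in the ChangeLog, but their bodies here are assumptions, and `ticks_per_second` is a hypothetical parameter standing in for the host-provided TICKS_PER_SECOND.

```cpp
#include <cassert>
#include <cstdint>

// Nanoseconds per second, written as the literal mentioned in the message.
constexpr uint64_t NANOSEC_PER_SEC = 1000000000;

// Assumed shape of the tick conversion: multiply by a whole number of
// nanoseconds per tick.  The integer division truncates, so the only
// rounding happens here, once, when a time is recorded.
inline uint64_t
ticks_to_nanosec (uint64_t ticks, uint64_t ticks_per_second)
{
  assert (ticks_per_second != 0 && ticks_per_second <= NANOSEC_PER_SEC);
  return ticks * (NANOSEC_PER_SEC / ticks_per_second);
}

// Elapsed time is a plain integer subtraction, so it gives the same
// result no matter how the compiler inlines or reorders the callers.
inline uint64_t
elapsed_nanosec (uint64_t start, uint64_t stop)
{
  return stop - start;
}

// Convert to floating point seconds only at the point of printing.
inline double
nanosec_to_floating_sec (uint64_t ns)
{
  return static_cast<double> (ns) / NANOSEC_PER_SEC;
}

// Percentage of TOTAL taken by PART; a zero total is reported as 0%.
inline double
percent_of (uint64_t part, uint64_t total)
{
  return total == 0 ? 0.0 : 100.0 * part / total;
}
```

With a hypothetical `ticks_per_second` of 3, three ticks convert to 999999999 ns rather than a full second, which is the kind of small normalisation rounding the message accepts. Two hours is 7200000000000 ns, which has 13 decimal digits, matching the padding width chosen for `timer::validate_phases`.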
306 lines
7.8 KiB
C++
/* Timing variables for measuring compiler performance.
   Copyright (C) 2000-2023 Free Software Foundation, Inc.
   Contributed by Alex Samuel <samuel@codesourcery.com>

   This file is part of GCC.

   GCC is free software; you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3, or (at your option)
   any later version.

   GCC is distributed in the hope that it will be useful, but WITHOUT
   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
   or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
   License for more details.

   You should have received a copy of the GNU General Public License
   along with GCC; see the file COPYING3. If not see
   <http://www.gnu.org/licenses/>. */

#ifndef GCC_TIMEVAR_H
#define GCC_TIMEVAR_H

namespace json { class value; }

/* Timing variables are used to measure elapsed time in various
   portions of the compiler. Each measures elapsed user, system, and
   wall-clock time, as appropriate to and supported by the host
   system.

   Timing variables are defined using the DEFTIMEVAR macro in
   timevar.def. Each has an enumeral identifier, used when referring
   to the timing variable in code, and a character string name.

   Timing variables can be used in two ways:

     - On the timing stack, using timevar_push and timevar_pop.
       Timing variables may be pushed onto the stack; elapsed time is
       attributed to the topmost timing variable on the stack. When
       another variable is pushed on, the previous topmost variable is
       `paused' until the pushed variable is popped back off.

     - As a standalone timer, using timevar_start and timevar_stop.
       All time elapsed between the two calls is attributed to the
       variable.
*/

/* This structure stores the various varieties of time that can be
   measured. Times are stored in nanoseconds. The time may be an
   absolute time or a time difference; in the former case, the time
   base is undefined, except that the difference between two times
   produces a valid time difference. */

struct timevar_time_def
{
  /* User time in this process. */
  uint64_t user;

  /* System time (if applicable for this host platform) in this process. */
  uint64_t sys;

  /* Wall clock time. */
  uint64_t wall;

  /* Garbage collector memory. */
  size_t ggc_mem;
};

/* An enumeration of timing variable identifiers. Constructed from
   the contents of timevar.def. */

#define DEFTIMEVAR(identifier__, name__) \
    identifier__,
typedef enum
{
  TV_NONE,
#include "timevar.def"
  TIMEVAR_LAST
}
timevar_id_t;
#undef DEFTIMEVAR

/* A class to hold all state relating to timing. */

class timer;

/* The singleton instance of timing state.

   This is non-NULL if timevars should be used. In GCC, this happens with
   the -ftime-report flag. Hence this is NULL for the common,
   needs-to-be-fast case, with an early reject happening for this being
   NULL. */
extern timer *g_timer;

/* Total amount of memory allocated by garbage collector. */
extern size_t timevar_ggc_mem_total;

extern void timevar_init (void);
extern void timevar_start (timevar_id_t);
extern void timevar_stop (timevar_id_t);
extern bool timevar_cond_start (timevar_id_t);
extern void timevar_cond_stop (timevar_id_t, bool);

/* The public (within GCC) interface for timing. */

class timer
{
 public:
  timer ();
  ~timer ();

  void start (timevar_id_t tv);
  void stop (timevar_id_t tv);
  void push (timevar_id_t tv);
  void pop (timevar_id_t tv);
  bool cond_start (timevar_id_t tv);
  void cond_stop (timevar_id_t tv);

  void push_client_item (const char *item_name);
  void pop_client_item ();

  void print (FILE *fp);
  json::value *make_json () const;

  const char *get_topmost_item_name () const;

 private:
  /* Private member functions. */
  void validate_phases (FILE *fp) const;

  struct timevar_def;
  void push_internal (struct timevar_def *tv);
  void pop_internal ();
  static void print_row (FILE *fp,
                         const timevar_time_def *total,
                         const char *name, const timevar_time_def &elapsed);
  static bool all_zero (const timevar_time_def &elapsed);

 private:
  typedef hash_map<timevar_def *, timevar_time_def> child_map_t;

  /* Private type: a timing variable. */
  struct timevar_def
  {
    json::value *make_json () const;

    /* Elapsed time for this variable. */
    struct timevar_time_def elapsed;

    /* If this variable is timed independently of the timing stack,
       using timevar_start, this contains the start time. */
    struct timevar_time_def start_time;

    /* The name of this timing variable. */
    const char *name;

    /* Nonzero if this timing variable is running as a standalone
       timer. */
    unsigned standalone : 1;

    /* Nonzero if this timing variable was ever started or pushed onto
       the timing stack. */
    unsigned used : 1;

    child_map_t *children;
  };

  /* Private type: an element on the timing stack
     Elapsed time is attributed to the topmost timing variable on the
     stack. */
  struct timevar_stack_def
  {
    /* The timing variable at this stack level. */
    struct timevar_def *timevar;

    /* The next lower timing variable context in the stack. */
    struct timevar_stack_def *next;
  };

  /* A class for managing a collection of named timing items, for use
     e.g. by libgccjit for timing client code. This class is declared
     inside timevar.cc to avoid everything using timevar.h
     from needing vec and hash_map. */
  class named_items;

 private:

  /* Data members (all private). */

  /* Declared timing variables. Constructed from the contents of
     timevar.def. */
  timevar_def m_timevars[TIMEVAR_LAST];

  /* The top of the timing stack. */
  timevar_stack_def *m_stack;

  /* A list of unused (i.e. allocated and subsequently popped)
     timevar_stack_def instances. */
  timevar_stack_def *m_unused_stack_instances;

  /* The time at which the topmost element on the timing stack was
     pushed. Time elapsed since then is attributed to the topmost
     element. */
  timevar_time_def m_start_time;

  /* If non-NULL, for use when timing libgccjit's client code. */
  named_items *m_jit_client_items;

  friend class named_items;
};

/* Provided for backward compatibility. */
inline void
timevar_push (timevar_id_t tv)
{
  if (g_timer)
    g_timer->push (tv);
}

inline void
timevar_pop (timevar_id_t tv)
{
  if (g_timer)
    g_timer->pop (tv);
}

// This is a simple timevar wrapper class that pushes a timevar in its
// constructor and pops the timevar in its destructor.
class auto_timevar
{
 public:
  auto_timevar (timer *t, timevar_id_t tv)
    : m_timer (t),
      m_tv (tv)
  {
    if (m_timer)
      m_timer->push (m_tv);
  }

  explicit auto_timevar (timevar_id_t tv)
    : m_timer (g_timer)
    , m_tv (tv)
  {
    if (m_timer)
      m_timer->push (m_tv);
  }

  ~auto_timevar ()
  {
    if (m_timer)
      m_timer->pop (m_tv);
  }

  // Disallow copies.
  auto_timevar (const auto_timevar &) = delete;

 private:
  timer *m_timer;
  timevar_id_t m_tv;
};

// As above, but use cond_start/stop.
class auto_cond_timevar
{
 public:
  auto_cond_timevar (timer *t, timevar_id_t tv)
    : m_timer (t),
      m_tv (tv)
  {
    start ();
  }

  explicit auto_cond_timevar (timevar_id_t tv)
    : m_timer (g_timer)
    , m_tv (tv)
  {
    start ();
  }

  ~auto_cond_timevar ()
  {
    if (m_timer && !already_running)
      m_timer->cond_stop (m_tv);
  }

  // Disallow copies.
  auto_cond_timevar (const auto_cond_timevar &) = delete;

 private:
  void start()
  {
    if (m_timer)
      already_running = m_timer->cond_start (m_tv);
    else
      already_running = false;
  }

  timer *m_timer;
  timevar_id_t m_tv;
  bool already_running;
};

extern void print_time (const char *, long);

#endif /* ! GCC_TIMEVAR_H */
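
The push-in-constructor, pop-in-destructor pattern used by `auto_timevar` in this header can be tried outside GCC with a small stand-in. The sketch below is an illustration only: `mock_timer`, `scoped_timevar`, and `run_nested_scopes` are hypothetical names that do not exist in GCC, and the mock records events instead of measuring time.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for GCC's timer class: it only logs push/pop
// calls so that the ordering can be inspected.
struct mock_timer
{
  std::vector<std::string> events;
  int depth = 0;

  void push (int tv)
  {
    ++depth;
    events.push_back ("push " + std::to_string (tv));
  }

  void pop (int tv)
  {
    assert (depth > 0);  // Every pop must match an earlier push.
    --depth;
    events.push_back ("pop " + std::to_string (tv));
  }
};

// Same shape as auto_timevar: push on construction, pop on destruction,
// and do nothing at all when no timer is installed.
class scoped_timevar
{
 public:
  scoped_timevar (mock_timer *t, int tv)
    : m_timer (t), m_tv (tv)
  {
    if (m_timer)
      m_timer->push (m_tv);
  }

  ~scoped_timevar ()
  {
    if (m_timer)
      m_timer->pop (m_tv);
  }

  // Disallow copies, as the original wrapper does.
  scoped_timevar (const scoped_timevar &) = delete;

 private:
  mock_timer *m_timer;
  int m_tv;
};

// Nest two scopes and return the order in which events were recorded.
inline std::vector<std::string>
run_nested_scopes ()
{
  mock_timer t;
  {
    scoped_timevar outer (&t, 1);
    {
      scoped_timevar inner (&t, 2);
    }
  }
  return t.events;
}
```

Because destructors run in reverse order of construction, the inner variable is popped before the outer one, which is the stack discipline the timing stack comment in this header describes. Passing a null timer makes the wrapper a no-op, mirroring the early reject on a NULL `g_timer`.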