postgresql/contrib/postgres_fdw/postgres_fdw.h

215 lines
7.3 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* postgres_fdw.h
* Foreign-data wrapper for remote PostgreSQL servers
*
* Portions Copyright (c) 2012-2021, PostgreSQL Global Development Group
*
* IDENTIFICATION
* contrib/postgres_fdw/postgres_fdw.h
*
*-------------------------------------------------------------------------
*/
#ifndef POSTGRES_FDW_H
#define POSTGRES_FDW_H
#include "foreign/foreign.h"
#include "lib/stringinfo.h"
#include "libpq-fe.h"
#include "nodes/pathnodes.h"
#include "utils/relcache.h"
/*
* FDW-specific planner information kept in RelOptInfo.fdw_private for a
* postgres_fdw foreign table. For a baserel, this struct is created by
* postgresGetForeignRelSize, although some fields are not filled till later.
* postgresGetForeignJoinPaths creates it for a joinrel, and
* postgresGetForeignUpperPaths creates it for an upperrel.
*/
typedef struct PgFdwRelationInfo
{
/*
* True means that the relation can be pushed down. Always true for simple
* foreign scan.
*/
bool pushdown_safe;
/*
* Restriction clauses, divided into safe and unsafe to pushdown subsets.
* All entries in these lists should have RestrictInfo wrappers; that
* improves efficiency of selectivity and cost estimation.
*/
List *remote_conds;
List *local_conds;
/* Actual remote restriction clauses for scan (sans RestrictInfos) */
List *final_remote_exprs;
/* Bitmap of attr numbers we need to fetch from the remote server. */
Bitmapset *attrs_used;
/* True means that the query_pathkeys is safe to push down */
bool qp_is_pushdown_safe;
/* Cost and selectivity of local_conds. */
QualCost local_conds_cost;
Selectivity local_conds_sel;
/* Selectivity of join conditions */
Selectivity joinclause_sel;
postgres_fdw: Fix costing of pre-sorted foreign paths with local stats. Commit aa09cd242 modified estimate_path_cost_size() so that it reuses cached costs of a basic foreign path for a given foreign-base/join relation when costing pre-sorted foreign paths for that relation, but it incorrectly re-computed retrieved_rows, an estimated number of rows fetched from the remote side, which is needed for costing both the basic and pre-sorted foreign paths. To fix, handle retrieved_rows the same way as the cached costs: store in that relation's fpinfo the retrieved_rows estimate computed for costing the basic foreign path, and reuse it when costing the pre-sorted foreign paths. Also, reuse the rows/width estimates stored in that relation's fpinfo when costing the pre-sorted foreign paths, to make the code consistent. In commit ffab494a4, to extend the costing mentioned above to the foreign-grouping case, I made a change to add_foreign_grouping_paths() to store in a given foreign-grouped relation's RelOptInfo the rows estimate for that relation for reuse, but this patch makes that change unnecessary since we already store the row estimate in that relation's fpinfo, which this patch reuses when costing a foreign path for that relation with the sortClause ordering; remove that change. In passing, fix thinko in commit 7012b132d: in estimate_path_cost_size(), the width estimate for a given foreign-grouped relation to be stored in that relation's fpinfo was reset incorrectly when costing a basic foreign path for that relation with local stats. Apply the patch to HEAD only to avoid destabilizing existing plan choices. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17jaJLPDEkgnP2VmkOg=5wT8YQ1CqssU8JRpZ_NSE+dqQ@mail.gmail.com
2019-06-14 19:49:59 +08:00
/* Estimated size and cost for a scan, join, or grouping/aggregation. */
double rows;
int width;
Cost startup_cost;
Cost total_cost;
postgres_fdw: Fix costing of pre-sorted foreign paths with local stats. Commit aa09cd242 modified estimate_path_cost_size() so that it reuses cached costs of a basic foreign path for a given foreign-base/join relation when costing pre-sorted foreign paths for that relation, but it incorrectly re-computed retrieved_rows, an estimated number of rows fetched from the remote side, which is needed for costing both the basic and pre-sorted foreign paths. To fix, handle retrieved_rows the same way as the cached costs: store in that relation's fpinfo the retrieved_rows estimate computed for costing the basic foreign path, and reuse it when costing the pre-sorted foreign paths. Also, reuse the rows/width estimates stored in that relation's fpinfo when costing the pre-sorted foreign paths, to make the code consistent. In commit ffab494a4, to extend the costing mentioned above to the foreign-grouping case, I made a change to add_foreign_grouping_paths() to store in a given foreign-grouped relation's RelOptInfo the rows estimate for that relation for reuse, but this patch makes that change unnecessary since we already store the row estimate in that relation's fpinfo, which this patch reuses when costing a foreign path for that relation with the sortClause ordering; remove that change. In passing, fix thinko in commit 7012b132d: in estimate_path_cost_size(), the width estimate for a given foreign-grouped relation to be stored in that relation's fpinfo was reset incorrectly when costing a basic foreign path for that relation with local stats. Apply the patch to HEAD only to avoid destabilizing existing plan choices. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17jaJLPDEkgnP2VmkOg=5wT8YQ1CqssU8JRpZ_NSE+dqQ@mail.gmail.com
2019-06-14 19:49:59 +08:00
/*
* Estimated number of rows fetched from the foreign server, and costs
* excluding costs for transferring those rows from the foreign server.
* These are only used by estimate_path_cost_size().
*/
double retrieved_rows;
Cost rel_startup_cost;
Cost rel_total_cost;
/* Options extracted from catalogs. */
bool use_remote_estimate;
Cost fdw_startup_cost;
Cost fdw_tuple_cost;
List *shippable_extensions; /* OIDs of shippable extensions */
/* Cached catalog information. */
ForeignTable *table;
ForeignServer *server;
UserMapping *user; /* only set in use_remote_estimate mode */
2016-06-10 06:02:36 +08:00
int fetch_size; /* fetch size for this remote table */
/*
Make postgres_fdw's "Relations" output agree with the rest of EXPLAIN. The relation aliases shown in the "Relations" line for a foreign scan didn't always agree with those used in the rest of EXPLAIN's output. The regression test result changes appearing here provide examples. It's really impossible for postgres_fdw to duplicate EXPLAIN's alias assignment logic during postgresGetForeignRelSize(), because of the de-duplication that EXPLAIN does on a global basis --- and anyway, trying to duplicate that would be unmaintainable. Instead, just put numeric rangetable indexes into the string, and convert those to table names/aliases in postgresExplainForeignScan, which does have access to the results of ruleutils.c's alias assignment logic. Aside from being more reliable, this shifts some work from planning to EXPLAIN, which is a good tradeoff for performance. (I also changed from using StringInfo to using psprintf, which makes the code slightly simpler and reduces its memory consumption.) A kluge required by this solution is that we have to reverse-engineer the rtoffset applied by setrefs.c. If that logic ever fails (presumably because the member tables of a join got offset by different amounts), we'll need some more cooperation with setrefs.c to keep things straight. But for now, there's no need for that. Arguably this is a back-patchable bug fix, but since this is a mostly cosmetic issue and there have been no field complaints, I'll refrain for now. Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-03 05:31:03 +08:00
* Name of the relation, for use while EXPLAINing ForeignScan. It is used
* for join and upper relations but is set for all relations. For a base
* relation, this is really just the RT index as a string; we convert that
* while producing EXPLAIN output. For join and upper relations, the name
* indicates which base foreign tables are included and the join type or
* aggregation type used.
*/
Make postgres_fdw's "Relations" output agree with the rest of EXPLAIN. The relation aliases shown in the "Relations" line for a foreign scan didn't always agree with those used in the rest of EXPLAIN's output. The regression test result changes appearing here provide examples. It's really impossible for postgres_fdw to duplicate EXPLAIN's alias assignment logic during postgresGetForeignRelSize(), because of the de-duplication that EXPLAIN does on a global basis --- and anyway, trying to duplicate that would be unmaintainable. Instead, just put numeric rangetable indexes into the string, and convert those to table names/aliases in postgresExplainForeignScan, which does have access to the results of ruleutils.c's alias assignment logic. Aside from being more reliable, this shifts some work from planning to EXPLAIN, which is a good tradeoff for performance. (I also changed from using StringInfo to using psprintf, which makes the code slightly simpler and reduces its memory consumption.) A kluge required by this solution is that we have to reverse-engineer the rtoffset applied by setrefs.c. If that logic ever fails (presumably because the member tables of a join got offset by different amounts), we'll need some more cooperation with setrefs.c to keep things straight. But for now, there's no need for that. Arguably this is a back-patchable bug fix, but since this is a mostly cosmetic issue and there have been no field complaints, I'll refrain for now. Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-03 05:31:03 +08:00
char *relation_name;
/* Join information */
RelOptInfo *outerrel;
RelOptInfo *innerrel;
JoinType jointype;
/* joinclauses contains only JOIN/ON conditions for an outer join */
List *joinclauses; /* List of RestrictInfo */
/* Upper relation information */
UpperRelationKind stage;
/* Grouping information */
List *grouped_tlist;
/* Subquery information */
bool make_outerrel_subquery; /* do we deparse outerrel as a
* subquery? */
bool make_innerrel_subquery; /* do we deparse innerrel as a
* subquery? */
Relids lower_subquery_rels; /* all relids appearing in lower
* subqueries */
/*
* Index of the relation. It is used to create an alias to a subquery
* representing the relation.
*/
int relation_index;
} PgFdwRelationInfo;
/* in postgres_fdw.c */
extern int set_transmission_modes(void);
extern void reset_transmission_modes(int nestlevel);
/* in connection.c */
extern PGconn *GetConnection(UserMapping *user, bool will_prep_stmt);
extern void ReleaseConnection(PGconn *conn);
extern unsigned int GetCursorNumber(PGconn *conn);
extern unsigned int GetPrepStmtNumber(PGconn *conn);
extern PGresult *pgfdw_get_result(PGconn *conn, const char *query);
extern PGresult *pgfdw_exec_query(PGconn *conn, const char *query);
extern void pgfdw_report_error(int elevel, PGresult *res, PGconn *conn,
bool clear, const char *sql);
/* in option.c */
extern int ExtractConnectionOptions(List *defelems,
const char **keywords,
const char **values);
extern List *ExtractExtensionList(const char *extensionsString,
bool warnOnMissing);
/* in deparse.c */
extern void classifyConditions(PlannerInfo *root,
RelOptInfo *baserel,
List *input_conds,
List **remote_conds,
List **local_conds);
extern bool is_foreign_expr(PlannerInfo *root,
RelOptInfo *baserel,
Expr *expr);
Avoid postgres_fdw crash for a targetlist entry that's just a Param. foreign_grouping_ok() is willing to put fairly arbitrary expressions into the targetlist of a remote SELECT that's doing grouping or aggregation on the remote side, including expressions that have no foreign component to them at all. This is possibly a bit dubious from an efficiency standpoint; but it rises to the level of a crash-causing bug if the expression is just a Param or non-foreign Var. In that case, the expression will necessarily also appear in the fdw_exprs list of values we need to send to the remote server, and then setrefs.c's set_foreignscan_references will mistakenly replace the fdw_exprs entry with a Var referencing the targetlist result. The root cause of this problem is bad design in commit e7cb7ee14: it put logic into set_foreignscan_references that IMV is postgres_fdw-specific, and yet this bug shows that it isn't postgres_fdw-specific enough. The transformation being done on fdw_exprs assumes that fdw_exprs is to be evaluated with the fdw_scan_tlist as input, which is not how postgres_fdw uses it; yet it could be the right thing for some other FDW. (In the bigger picture, setrefs.c has no business assuming this for the other expression fields of a ForeignScan either.) The right fix therefore would be to expand the FDW API so that the FDW could inform setrefs.c how it intends to evaluate these various expressions. We can't change that in the back branches though, and we also can't just summarily change setrefs.c's behavior there, or we're likely to break external FDWs. As a stopgap, therefore, hack up postgres_fdw so that it won't attempt to send targetlist entries that look exactly like the fdw_exprs entries they'd produce. In most cases this actually produces a superior plan, IMO, with less data needing to be transmitted and returned; so we probably ought to think harder about whether we should ship tlist expressions at all when they don't contain any foreign Vars or Aggs. But that's an optimization not a bug fix so I left it for later. One case where this produces an inferior plan is where the expression in question is actually a GROUP BY expression: then the restriction prevents us from using remote grouping. It might be possible to work around that (since that would reduce to group-by-a-constant on the remote side); but it seems like a pretty unlikely corner case, so I'm not sure it's worth expending effort solely to improve that. In any case the right long-term answer is to fix the API as sketched above, and then revert this hack. Per bug #15781 from Sean Johnston. Back-patch to v10 where the problem was introduced. Discussion: https://postgr.es/m/15781-2601b1002bad087c@postgresql.org
2019-04-28 01:15:54 +08:00
extern bool is_foreign_param(PlannerInfo *root,
RelOptInfo *baserel,
Expr *expr);
extern void deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *targetAttrs, bool doNothing,
List *withCheckOptionList, List *returningList,
List **retrieved_attrs, int *values_end_len);
extern void rebuildInsertSql(StringInfo buf, char *orig_query,
int values_end_len, int num_cols,
int num_rows);
extern void deparseUpdateSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *targetAttrs,
List *withCheckOptionList, List *returningList,
List **retrieved_attrs);
extern void deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
RelOptInfo *foreignrel,
List *targetlist,
List *targetAttrs,
List *remote_conds,
List **params_list,
List *returningList,
List **retrieved_attrs);
extern void deparseDeleteSql(StringInfo buf, RangeTblEntry *rte,
Index rtindex, Relation rel,
List *returningList,
List **retrieved_attrs);
extern void deparseDirectDeleteSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
RelOptInfo *foreignrel,
List *remote_conds,
List **params_list,
List *returningList,
List **retrieved_attrs);
extern void deparseAnalyzeSizeSql(StringInfo buf, Relation rel);
extern void deparseAnalyzeSql(StringInfo buf, Relation rel,
List **retrieved_attrs);
extern void deparseStringLiteral(StringInfo buf, const char *val);
extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
extern Expr *find_em_expr_for_input_target(PlannerInfo *root,
EquivalenceClass *ec,
PathTarget *target);
extern List *build_tlist_to_deparse(RelOptInfo *foreignrel);
extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root,
RelOptInfo *foreignrel, List *tlist,
List *remote_conds, List *pathkeys,
bool has_final_sort, bool has_limit,
bool is_subquery,
List **retrieved_attrs, List **params_list);
extern const char *get_jointype_name(JoinType jointype);
/* in shippable.c */
extern bool is_builtin(Oid objectId);
extern bool is_shippable(Oid objectId, Oid classId, PgFdwRelationInfo *fpinfo);
Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-22 03:18:54 +08:00
#endif /* POSTGRES_FDW_H */