Jakub Jelinek 0b8c57ed40 libcpp: Add -Winvalid-utf8 warning [PR106655]
The following patch introduces a new warning - -Winvalid-utf8 similarly
to what clang now has - to diagnose invalid UTF-8 byte sequences in
comments, but not just in those, but also in string/character literals
and outside of them.

The warning is on by default when explicit -finput-charset=UTF-8 is
used and C++23 compilation is requested and if -{,W}pedantic or
-pedantic-errors it is actually a pedwarn.

The reason it is on by default only for -finput-charset=UTF-8 is
that the sources often are UTF-8, but sometimes could be some ASCII
compatible single byte encoding where non-ASCII characters only
appear in comments.  So having the warning off by default
is IMO desirable.  The C++23 pedantic mode for when the source code
is UTF-8 is -std=c++23 -pedantic-errors -finput-charset=UTF-8.

2022-09-01  Jakub Jelinek  <jakub@redhat.com>

	PR c++/106655
libcpp/
	* include/cpplib.h (struct cpp_options): Implement C++23
	P2295R6 - Support for UTF-8 as a portable source file encoding.
	Add cpp_warn_invalid_utf8 and cpp_input_charset_explicit fields.
	(enum cpp_warning_reason): Add CPP_W_INVALID_UTF8 enumerator.
	* init.cc (cpp_create_reader): Initialize cpp_warn_invalid_utf8
	and cpp_input_charset_explicit.
	* charset.cc (_cpp_valid_utf8): Adjust function comment.
	* lex.cc (UCS_LIMIT): Define.
	(utf8_continuation): New const variable.
	(utf8_signifier): Move earlier in the file.
	(_cpp_warn_invalid_utf8, _cpp_handle_multibyte_utf8): New functions.
	(_cpp_skip_block_comment): Handle -Winvalid-utf8 warning.
	(skip_line_comment): Likewise.
	(lex_raw_string, lex_string): Likewise.
	(_cpp_lex_direct): Likewise.
gcc/
	* doc/invoke.texi (-Winvalid-utf8): Document it.
gcc/c-family/
	* c.opt (-Winvalid-utf8): New warning.
	* c-opts.cc (c_common_handle_option) <case OPT_finput_charset_>:
	Set cpp_opts->cpp_input_charset_explicit.
	(c_common_post_options): If -finput-charset=UTF-8 is explicit
	in C++23, enable -Winvalid-utf8 by default and if -pedantic
	or -pedantic-errors, make it a pedwarn.
gcc/testsuite/
	* c-c++-common/cpp/Winvalid-utf8-1.c: New test.
	* c-c++-common/cpp/Winvalid-utf8-2.c: New test.
	* c-c++-common/cpp/Winvalid-utf8-3.c: New test.
	* g++.dg/cpp23/Winvalid-utf8-1.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-2.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-3.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-4.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-5.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-6.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-7.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-8.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-9.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-10.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-11.C: New test.
	* g++.dg/cpp23/Winvalid-utf8-12.C: New test.
2022-09-01 09:56:44 +02:00
2022-03-19 00:16:22 +00:00
2022-09-01 00:17:39 +00:00
2022-09-01 00:17:39 +00:00
2022-09-01 00:17:39 +00:00
2022-09-01 00:17:39 +00:00
2022-08-31 00:16:45 +00:00
2022-07-13 00:16:33 +00:00
2021-11-30 00:16:44 +00:00
2022-08-26 00:16:21 +00:00
2022-08-31 00:16:45 +00:00
2022-07-09 00:16:54 +00:00
2022-06-28 00:16:58 +00:00
2022-06-04 00:16:27 +00:00
2022-05-21 00:16:32 +00:00
2021-11-16 00:16:31 +00:00
2022-09-01 00:17:39 +00:00
2022-08-27 00:17:09 +00:00
2022-07-30 10:35:23 -07:00
2022-08-27 00:17:09 +00:00
2022-08-26 00:16:21 +00:00
2022-09-01 00:17:39 +00:00
2022-08-26 00:16:21 +00:00
2022-08-26 00:16:21 +00:00
2022-08-28 00:16:28 +00:00
2022-08-26 00:16:21 +00:00
2022-09-01 00:17:39 +00:00
2022-08-26 00:16:21 +00:00
2022-09-01 00:17:39 +00:00
2022-08-26 00:16:21 +00:00
2022-08-02 00:16:51 +00:00
2022-07-29 00:16:21 +00:00
2022-08-26 00:16:21 +00:00
2022-07-19 17:07:04 +03:00
2022-09-01 00:17:39 +00:00
2021-12-21 09:10:57 +01:00
2022-08-04 13:38:28 -07:00

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.
Description
No description provided
Readme 2.1 GiB
Languages
C++ 31.9%
C 31.3%
Ada 12%
D 6.5%
Go 6.4%
Other 11.5%