Commit f0806d6

Redo validation errors in the IPv4 parser
ab0e820 didn't consider the impact on the validation error infrastructure.

This also makes some minor additional changes:

* Modernizes the "Host parsing" section.
* Clarifies that the IPv4 parser cannot be invoked directly.
* Clarifies that the IPv6 parser can, but you should still not do it.

Fixes #706.
1 parent f98ffbc commit f0806d6

1 file changed: +28 -22 lines changed

url.bs

Lines changed: 28 additions & 22 deletions
@@ -414,7 +414,8 @@ point <a for=/>URLs</a> from <var>A</var> can come from untrusted sources.
 
 <div class=example id=example-host-parsing>
 <p>A <a lt="host parser">parse</a>-<a lt="host serializer">serialize</a> roundtrip gives the
-following results, depending on the <var>isNotSpecial</var> argument to the <a>host parser</a>:
+following results, depending on the <var ignore>isNotSpecial</var> argument to the
+<a>host parser</a>:
 
 <table>
 <tr>
@@ -732,9 +733,10 @@ to be distinguished.
 
 <h3 id=host-parsing>Host parsing</h3>
 
+<div algorithm>
 <p>The <dfn export id=concept-host-parser lt="host parser|host parsing">host parser</dfn> takes a
 <a>scalar value string</a> <var>input</var> with an optional boolean <var>isNotSpecial</var>
-(default false), and then runs these steps:
+(default false), and then runs these steps. They return failure or a <a for=/>host</a>.
 
 <ol>
 <li>
@@ -772,11 +774,13 @@ to be distinguished.
 
 <li><p>Return <var>asciiDomain</var>.
 </ol>
+</div>
 
 <hr>
 
+<div algorithm>
 <p>The <dfn>ends in a number checker</dfn> takes an <a>ASCII string</a> <var>input</var> and then
-runs these steps:
+runs these steps. They return a boolean.
 
 <ol>
 <li><p>Let <var>parts</var> be the result of <a>strictly splitting</a> <var>input</var> on
@@ -808,26 +812,24 @@ runs these steps:
 
 <li><p>Return false.
 </ol>
+</div>
 
+<div algorithm>
 <p>The <dfn id=concept-ipv4-parser>IPv4 parser</dfn> takes an <a>ASCII string</a> <var>input</var>
-and then runs these steps:
+and then runs these steps. They return failure or an <a for=/>IPv4 address</a>.
 
-<ol>
-<li>
-<p>Let <var>validationError</var> be false.
-
-<p class=note>This uses <var>validationError</var> to track <a>validation errors</a> to avoid
-reporting them before we are confident we want to parse <var>input</var> as an IPv4 address as the
-<a>host parser</a> almost always invokes the <a>IPv4 parser</a>.
+<p class=note>The <a for=/>IPv4 parser</a> is not to be invoked directly. Instead check that the
+return value of the <a for=/>host parser</a> is an <a for=/>IPv4 address</a>.
 
+<ol>
 <li><p>Let <var>parts</var> be the result of <a>strictly splitting</a> <var>input</var> on
 U+002E (.).
 
 <li>
 <p>If the last <a for=list>item</a> in <var>parts</var> is the empty string, then:
 
 <ol>
-<li><p>Set <var>validationError</var> to true.
+<li><p><a>Validation error</a>.
 
 <li><p>If <var>parts</var>'s <a for=list>size</a> is greater than 1, then <a for=list>remove</a>
 the last <a for=list>item</a> from <var>parts</var>.
@@ -849,18 +851,11 @@ and then runs these steps:
 
 <li><p>If <var>result</var> is failure, <a>validation error</a>, return failure.
 
-<li><p>If <var>result</var>[1] is true, then set <var>validationError</var> to true.
+<li><p>If <var>result</var>[1] is true, <a>validation error</a>.
 
 <li><p><a for=list>Append</a> <var>result</var>[0] to <var>numbers</var>.
 </ol>
 
-<li>
-<p>If <var>validationError</var> is true, <a>validation error</a>.
-
-<p class="note">At this point each part was parsed into a number and <var>input</var> will be
-treated as an IPv4 address (or failure). And therefore error reporting resumes.
-</li>
-
 <li><p>If any item in <var>numbers</var> is greater than 255, <a>validation error</a>.
 
 <li><p>If any but the last <a for=list>item</a> in <var>numbers</var> is greater than 255, then
@@ -888,7 +883,9 @@ and then runs these steps:
 
 <li><p>Return <var>ipv4</var>.
 </ol>
+</div>
 
+<div algorithm>
 <p>The <dfn>IPv4 number parser</dfn> takes an <a>ASCII string</a> <var>input</var> and then runs
 these steps:
 
@@ -939,11 +936,16 @@ these steps:
 
 <li><p>Return (<var>output</var>, <var>validationError</var>).
 </ol>
+</div>
 
 <hr>
 
+<div algorithm>
 <p>The <dfn id=concept-ipv6-parser>IPv6 parser</dfn> takes a <a>scalar value string</a>
-<var>input</var> and then runs these steps:
+<var>input</var> and then runs these steps. They return failure or an <a for=/>IPv6 address</a>.
+
+<p class=note>The <a for=/>IPv6 parser</a> could in theory be invoked directly, but please discuss
+actually doing that with the editors of this document first.
 
 <ol>
 <li><p>Let <var>address</var> be a new <a>IPv6 address</a> whose <a>IPv6 pieces</a> are all 0.
@@ -1089,11 +1091,14 @@ these steps:
 
 <li><p>Return <var>address</var>.
 </ol>
+</div>
 
 <hr>
 
+<div algorithm>
 <p>The <dfn export id=concept-opaque-host-parser>opaque-host parser</dfn> takes a
-<a>scalar value string</a> <var>input</var>, and then runs these steps:
+<a>scalar value string</a> <var>input</var>, and then runs these steps. They return failure or an
+<a for=/>opaque host</a>.
 
 <ol>
 <li><p>If <var>input</var> contains a <a>forbidden host code point</a>,
@@ -1108,6 +1113,7 @@ these steps:
 <li><p>Return the result of running <a for=string>UTF-8 percent-encode</a> on <var>input</var>
 using the <a>C0 control percent-encode set</a>.
 </ol>
+</div>
 
 
 <h3 id=host-serializing>Host serializing</h3>
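
The reporting change in the IPv4 parser hunks can be illustrated with a rough Python sketch. It is not part of the commit and covers only the steps visible in this diff: validation errors are now reported where they occur instead of being collected in a `validationError` flag and reported later. The names `parse_parts`, `parse_number`, and `report` are hypothetical. `parse_number` is an assumed stand-in for the IPv4 number parser, whose steps this diff does not show, and the steps that fall between hunks are left out.

```python
from typing import Callable, List, Optional, Tuple


def parse_number(part: str) -> Optional[Tuple[int, bool]]:
    """Assumed stand-in for the IPv4 number parser (its steps are not in this diff).

    Returns (value, validation_error), or None for failure.
    """
    if part == "":
        return None
    flagged = False
    radix = 10
    if len(part) >= 2 and part[:2] in ("0x", "0X"):
        flagged, part, radix = True, part[2:], 16
    elif len(part) >= 2 and part[0] == "0":
        flagged, part, radix = True, part[1:], 8
    if part == "":
        return (0, True)
    allowed = "0123456789abcdef"[:radix]
    if any(ch not in allowed for ch in part.lower()):
        return None
    return (int(part, radix), flagged)


def parse_parts(text: str, report: Callable[[str], None]) -> Optional[List[int]]:
    """Hypothetical helper: only the IPv4 parser steps shown in this diff."""
    parts = text.split(".")  # strict split on U+002E (.)
    if parts[-1] == "":
        report("last part is empty")  # reported immediately, no flag
        if len(parts) > 1:
            parts.pop()
    numbers: List[int] = []
    for part in parts:
        result = parse_number(part)
        if result is None:
            report("part is not a number")
            return None  # failure
        if result[1]:
            report("part is not plain decimal")  # reported immediately, no flag
        numbers.append(result[0])
    if any(n > 255 for n in numbers):
        report("a part is greater than 255")
    return numbers  # the remaining steps of the parser are omitted here
```

For example, `parse_parts("0x7f.1.", print)` reports the trailing empty part and the hexadecimal part as it reaches each of them, rather than once after all parts have been parsed.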
