[lfs-dev] Coreutils 8.25 i18n patch breaks cut

DJ Lucas dj at linuxfromscratch.org
Mon Feb 8 00:05:39 PST 2016



On 2/7/2016 10:27 PM, Chris Staub wrote:
> I just tried using jhalfs with a current CLFS system (using a few things
> from LFS) as a host, and it failed attempting to download packages.
> Specifically, basename complained about an extra parameter, which
> happened to be the package's md5sum. After a bit of debugging, I found
> that the problem comes from this command that jhalfs tries to run:
>
> echo http://download.savannah.gnu.org/releases/acl/acl-2.2.52.src.tar.gz
> ftp://ftp.lfs-matrix.net/pub/clfs/conglomeration/acl/acl-2.2.52.src.tar.gz
> a61415312426e9c2212bd7dc7929abda | cut -d" " -f2
>
> This should result in the 2nd URL listed, but instead it gives this
> output:
>
> http://download.savannah.gnu.org/releases/acl/acl-2.2.52.src.ta
> ftp://ftp.lfs-matrix.net/pub/clfs/conglomeration/acl/acl-2.2.52
> a61415312426e9c2212bd7dc7929abda
>
> The problem is that the "cut" program isn't giving the expected output.
> After some further experimentation, and checking with William Harrington
> in IRC who got the same results, this problem only occurs when using
> Coreutils 8.25 with the i18n patch that's in LFS. When using 8.24, or
> 8.25 without the patch, I get the expected result and jhalfs runs fine.
>

Figures! As soon as I said something about this patch having not screwed 
anything up in a while. I know better! :-)

> After some further testing, cut appears to screw up whenever the first
> field is >=64 characters. The patch defines MIN_CHUNK in cut.c with a
> value of 64, and when I increased that to 128 I could use cut
> successfully with the first field up to 128 characters. I don't think
> MIN_CHUNK is supposed to be an *upper* limit for field length, but
> apparently that is what it is being used as.

It's used to grow the buffer. Raising the initial value to 128 (or even
69 for this example) only avoids the error in this particular case,
because the delimiter is then found in the buffer without having to
extend it.

+  found_delimiter = false;
+  do
+    {
+      /* Here always ptr + size == read_pos + nchars_avail.
+         Also nchars_avail > 0 || size < nmax.  */
+
+      mbf_char_t c IF_LINT (= 0);
+        {
+          mbf_getc (c, *stream);
+          if (mb_iseof (c))
+            {
+              /* Return partial line, if any.  */
+              if (read_pos == ptr)
+                goto unlock_done;
+              else
+                break;
+            }
+          if (mb_equal (c, delim1) || mb_equal (c, delim2))
+            found_delimiter = true;
+        }
+
+      /* We always want at least one byte left in the buffer, since we
+         always (unless we get an error while reading the first byte)
+         NUL-terminate the line buffer.  */
+
+      if (!nchars_avail)
+        {
+          /* Grow size proportionally, not linearly, to avoid O(n^2)
+             running time.  */
+          size_t newsize = size < MIN_CHUNK ? size + MIN_CHUNK : 2 * size;
+          mbf_char_t *newptr;
+
+          /* Respect nmax.  This handles possible integer overflow.  */
+          if (! (size < newsize && newsize <= nmax))
+            newsize = nmax;
+
+          if (GETNDELIM2_MAXIMUM < newsize)
+            {
+              size_t newsizemax = GETNDELIM2_MAXIMUM + 1;
+              if (size == newsizemax)
+                goto unlock_done;
+              newsize = newsizemax;
+            }
+          nchars_avail = newsize - (read_pos - ptr);
+          newptr = realloc (ptr, newsize * sizeof (mbf_char_t));
+          if (!newptr)
+            goto unlock_done;
+          ptr = newptr;
+          size = newsize;
+          read_pos = size - nchars_avail + ptr;
+        }

Notice that what was printed to screen was exactly 63 bytes? I don't 
understand it completely. Somebody else will have to take a look. 
Setting MIN_CHUNK to exactly 69 works (to account for the length of the 
first string +1), but the second string is 73 bytes/chars. Also, there 
is an unrelated overflow that should be fixed there. This is consistent 
with what Fedora has now, but they are reworking the patch from: 
http://pkgs.fedoraproject.org/cgit/rpms/coreutils.git/plain/rh_i18n_wip.tar.gz 


--DJ



