Skip to content

Conversation

@bringhurst
Copy link
Contributor

@bringhurst bringhurst commented Feb 8, 2023

Manifest generation (application-level) is pointing to container-level symlinks instead of application-level files.

This will cause issues when multiple containers for an application land on a single host and the container the manifest is pointing to gets cleaned up.

[yyy@zzz filecache]$ cat /data/appcache/application_1668550466222_0989/filecache/10/xxx-0.0.672.tgz/classpath_workspace/manifest.txt|tail -n 2
 /data/appcache/application_1668550466222_0989/container_e114_1668550466222_0989_01_000003/__package/lib/zstd-jni-1.5.2-3.jar

[yyy@zzz filecache]$

This patch is to resolve those symlinks to their application level paths.

JIRA=SAMZA-2774

@bringhurst
Copy link
Contributor Author

Here's a little test script just to verify the behavior:

#!/bin/bash

#set -x

rm -rf test_target_dir
rm -rf test_link_dir

mkdir test_target_dir
touch test_target_dir/target_file

mkdir test_link_dir
ln -s ../test_target_dir/target_file test_link_dir/link_file

file=test_link_dir/link_file
resolved_file=$( cd $(dirname $(readlink $([[ $OSTYPE == linux* ]] && echo "-f") $file)) ; pwd -P)

echo "resolved=" $resolved_file

file=test_target_dir/target_file
resolved_file=$( cd $(dirname $(readlink `[[ $OSTYPE == linux* ]] && echo "-f"` "$file")) ; pwd -P)

echo "original=" $resolved_file

@mynameborat
Copy link
Contributor

Thank you for fixing it.
Can you please update the description according to - https://cwiki.apache.org/confluence/display/SAMZA/SEP-25%3A+PR+Title+And+Description+Guidelines

# Symlinks need to be resolved here, otherwise, the jars listed in the
# manifest below will point at the first container launched instead of
# the jars at the application level.
resolved_file=$( cd $(dirname $(readlink `[[ $OSTYPE == linux* ]] && echo "-f"` "$file")) ; pwd -P)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-f options also works on Mac, why linux only?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

man readlink on linux:

READLINK(1)                                                                                                User Commands                                                                                                READLINK(1)

NAME
       readlink - print resolved symbolic links or canonical file names

SYNOPSIS
       readlink [OPTION]... FILE...

DESCRIPTION
       Print value of a symbolic link or canonical file name

       -f, --canonicalize
              canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist

man readlink on OS X:

STAT(1)                                                                                                 General Commands Manual                                                                                                 STAT(1)

NAME
     stat, readlink – display file status

SYNOPSIS
     stat [-FLnq] [-f format | -l | -r | -s | -x] [-t timefmt] [file ...]
     readlink [-fn] [file ...]

DESCRIPTION
     The stat utility displays information about the file pointed to by file.  Read, write, or execute permissions of the named file are not required, but all directories listed in the pathname leading to the file must be
     searchable.  If no argument is given, stat displays information about the file descriptor for standard input.

     When invoked as readlink, only the target of the symbolic link is printed.  If the given argument is not a symbolic link and the -f option is not specified, readlink will print nothing and exit with an error.  If the -f option
     is specified, the output is canonicalized by following every symlink in every component of the given path recursively.  readlink will resolve both absolute and relative paths, and return the absolute pathname corresponding to
     file.  In this case, the argument does not need to be a symbolic link.

     The information displayed is obtained by calling lstat(2) with the given argument and evaluating the returned structure.  The default format displays the st_dev, st_ino, st_mode, st_nlink, st_uid, st_gid, st_rdev, st_size,
     st_atime, st_mtime, st_ctime, st_birthtime, st_blksize, st_blocks, and st_flags fields, in that order.

     The options are as follows:

     -F      As in ls(1), display a slash (‘/’) immediately after each pathname that is a directory, an asterisk (‘*’) after each that is executable, an at sign (‘@’) after each symbolic link, a percent sign (‘%’) after each
             whiteout, an equal sign (‘=’) after each socket, and a vertical bar (‘|’) after each that is a FIFO.  The use of -F implies -l.

     -L      Use stat(2) instead of lstat(2).  The information reported by stat will refer to the target of file, if file is a symbolic link, and not to file itself.  If the link is broken or the target does not exist, fall back on
             lstat(2) and report information about the link.

     -f format
             Display information using the specified format.  See the Formats section for a description of valid formats.

So readlink exists on both, but the meaning of -f is different.

@mynameborat mynameborat merged commit c43f423 into apache:master Aug 31, 2023
khandelwal-ayush pushed a commit to khandelwal-ayush/samza that referenced this pull request Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants