Merged
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
# Temporary directory to clone GitHub repository
ghrepos
5 changes: 3 additions & 2 deletions Dockerfile
@@ -1,8 +1,10 @@
# TODO: Install git

FROM python:3.7.3-alpine3.9

# install java etc
RUN apk update
RUN apk --no-cache add tar wget openjdk8 gcc pkgconfig zeromq zeromq-dev musl-dev
RUN apk --no-cache add tar wget openjdk8 gcc pkgconfig zeromq zeromq-dev musl-dev git

# install python package
RUN pip install jupyter click html2text
@@ -15,7 +17,6 @@ RUN mkdir -p /user/local/redpen
RUN mv redpen-distribution-1.10.1 /usr/local/redpen
# RUN mv redpen-redpen-1.10.2 /usr/local/redpen


# add redpen to PATH
ENV PATH="/usr/local/redpen/bin:${PATH}"

40 changes: 23 additions & 17 deletions README.md
@@ -2,32 +2,38 @@

A tool for checking the Japanese translation of [tensorflow/docs](https://github.com/tensorflow/docs) for issues such as inconsistent notation.

# Usage
## Usage

```
$ git clone https://github.com/tensorflow/docs
$ cd docs/
$ git clone https://github.com/tfug/proofreading proofreading
$ cd proofreading
$ bin/run-check # run text lint on the Docker container
$ bin/clear-output # remove temporary files
This tool works as follows:

1. Clone GitHub repository
2. Convert `*.ipynb` to `*.md` with `jupyter nbconvert`
3. Apply RedPen to `*.md`
4. Output the result to a text file

Basic usage is as below:

```bash
$ ./bin/run ${REPOSITORY} ${BRANCH} ${OUTPUT_FILE}
```

To check one specific translated file,
pass its path relative to tensorflow/docs as an argument to the `bin/run-check` command, as below.
### Without Docker

```bash
$ ./bin/run tensorflow/docs master result.txt
```
$ bin/run-check site/ja/tutorials/keras/index.md

### With Docker

To use Docker, run the proofreading as follows:

```bash
$ ./bin/run-docker tensorflow/docs master result.txt
```

# Why use RedPen?
## Why use RedPen?

Several people work on the translation, so many orthographic variants are expected to occur.
[RedPen](http://redpen.cc/) is a proofreading tool that helps authors write documents that must adhere to a writing standard.
It helps us keep document quality high without losing writing speed while translation tasks are distributed among multiple people.
RedPen officially supports English and Japanese, but some of its functions can be used with other languages.


The checking process consists of the following two parts.
1. run `jupyter nbconvert` to convert Jupyter notebooks to Markdown
2. run `redpen` to proofread the Markdown
2 changes: 1 addition & 1 deletion bin/build-docker
@@ -1,3 +1,3 @@
#!/bin/bash
docker build --no-cache -t tfug/proofreading .

docker build --no-cache -t tfug/proofreading .
1 change: 0 additions & 1 deletion bin/clear-output

This file was deleted.

45 changes: 45 additions & 0 deletions bin/run
@@ -0,0 +1,45 @@
#!/bin/bash

# Check the number of arguments
if [ $# -ne 3 ]; then
    echo "Error: Invalid arguments"
    echo "Usage: ./bin/run <repository name> <branch name> <output file>"
    exit 1
fi

GITHUB_REPOSITORY=${1}
GITHUB_REPOSITORY_URL="https://github.com/${GITHUB_REPOSITORY}"
BRANCH=${2}
OUTPUT_FILE=${3}

echo "GITHUB_REPOSITORY: ${GITHUB_REPOSITORY}"
echo "GITHUB_REPOSITORY_URL: ${GITHUB_REPOSITORY_URL}"
echo "BRANCH: ${BRANCH}"
echo "OUTPUT_FILE: ${OUTPUT_FILE}"
Review comment (Collaborator):

A minor point, but to reduce the number of arguments, I thought the OUTPUT_FILE name could simply be fixed.
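
One way the suggestion above could be realized is to make the third argument optional rather than removing it. The sketch below is only an illustration: `parse_args` is a hypothetical helper that is not part of this PR, and the default name `result.txt` is an assumption borrowed from the README examples.

```shell
#!/bin/bash
# Hypothetical variant of the argument handling in bin/run:
# the output file becomes optional and falls back to a fixed name.

# parse_args is an illustrative helper, not part of the PR.
parse_args() {
    if [ $# -lt 2 ] || [ $# -gt 3 ]; then
        echo "Usage: ./bin/run <repository name> <branch name> [output file]" >&2
        return 1
    fi
    GITHUB_REPOSITORY=${1}
    BRANCH=${2}
    # Use the third argument if given, otherwise the assumed default name.
    OUTPUT_FILE=${3:-result.txt}
}

parse_args tensorflow/docs master
echo "OUTPUT_FILE: ${OUTPUT_FILE}"
```

With this shape, existing three-argument invocations keep working, while the common case needs only the repository and the branch.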


TEMP_DIR="ghrepos"

# Remove temporary directory
rm -rf ${TEMP_DIR}
mkdir ${TEMP_DIR}

# Clone GitHub repository
git clone -b ${BRANCH} ${GITHUB_REPOSITORY_URL} ${TEMP_DIR}/${GITHUB_REPOSITORY}

# Convert all notebooks to markdowns
notebooks=`find ${TEMP_DIR}/${GITHUB_REPOSITORY}/site/ja -type f | grep .ipynb`
for notebook in ${notebooks}; do
    jupyter nbconvert --to markdown ${notebook}
done

# Create output file
echo "GITHUB_REPOSITORY: ${GITHUB_REPOSITORY}" > "${OUTPUT_FILE}"
echo "BRANCH: ${BRANCH}" >> "${OUTPUT_FILE}"
echo "" >> "${OUTPUT_FILE}"

# Apply RedPen to all markdowns
files=`find ${TEMP_DIR}/${GITHUB_REPOSITORY}/site/ja -type f | grep .md`
for file in ${files}; do
    echo "[${file}]" >> "${OUTPUT_FILE}"
    redpen --result-format plain2 ${file} >> "${OUTPUT_FILE}"
done
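
A note on the file selection in the script above: `grep .ipynb` and `grep .md` treat the dot as "any character" and are not anchored to the end of the path, so they can match more than intended. The sketch below shows one possible alternative that uses `find`'s `-name` test. `list_by_ext` and the demo file names are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# Sketch: select files by exact extension with find's -name test
# instead of piping the file list through an unanchored grep pattern.

# list_by_ext is a hypothetical helper: it prints the files under
# directory $1 whose names end in ".$2".
list_by_ext() {
    find "${1}" -type f -name "*.${2}" | sort
}

# Demo in a throwaway directory (file names are made up).
demo_dir=$(mktemp -d)
mkdir -p "${demo_dir}/site/ja"
touch "${demo_dir}/site/ja/index.md" \
      "${demo_dir}/site/ja/amd64.txt" \
      "${demo_dir}/site/ja/guide.ipynb"

echo "Matched by 'grep .md':"
find "${demo_dir}/site/ja" -type f | grep .md | sort

echo "Matched by -name '*.md':"
list_by_ext "${demo_dir}/site/ja" md
```

In the demo, the unanchored pattern also selects `amd64.txt`, while the `-name` test selects only `index.md`. If this approach were adopted, the conversion step could be written with `find ... -name '*.ipynb' -exec jupyter nbconvert --to markdown {} \;`, which would also avoid word splitting on paths that contain spaces.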
7 changes: 0 additions & 7 deletions bin/run-check

This file was deleted.

8 changes: 8 additions & 0 deletions bin/run-docker
@@ -0,0 +1,8 @@
#!/bin/bash

docker run \
    -it \
    --rm \
    -v $(PWD):/usr/local/documents \
    tfug/proofreading \
    /bin/ash ./bin/run ${1} ${2} ${3}
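
One portability note on the volume mount above: in bash, `$(PWD)` is command substitution, so the shell tries to run a program named `PWD` rather than reading the `PWD` variable. This may happen to work on case-insensitive filesystems, where `PWD` can resolve to the `pwd` binary, but that is an assumption about the author's environment and not something the PR states. A minimal sketch of the difference:

```shell
#!/bin/bash
# ${PWD} reads the shell variable holding the current directory.
# $(pwd) runs the pwd builtin and captures its output.
from_variable="${PWD}"
from_builtin="$(pwd)"
echo "variable: ${from_variable}"
echo "builtin:  ${from_builtin}"

# $(PWD) tries to execute a command literally named PWD.
# Where no such command exists, the substitution yields an empty string.
from_command="$(PWD 2>/dev/null || true)"
echo "command:  '${from_command}'"
```

Under that reading, a more portable form of the mount line would be `-v "${PWD}":/usr/local/documents`, with the quotes protecting paths that contain spaces.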
80 changes: 0 additions & 80 deletions proofreading.sh

This file was deleted.

32 changes: 0 additions & 32 deletions src/html_converter.py

This file was deleted.