Merged
7 changes: 7 additions & 0 deletions .env
@@ -0,0 +1,7 @@
# Version pins — single source of truth for image tags and the
# fess_config.properties base fetched by bin/render-fess-config.sh.
# Docker Compose auto-loads this file for ${VAR} substitution in compose.yaml.
# NOTE: version pins only — do NOT put secrets here
# (use conf/fess_config.local.properties for those).
FESS_VERSION=15.7.0
OPENSEARCH_VERSION=3.7.0
44 changes: 33 additions & 11 deletions .gitignore
@@ -1,11 +1,33 @@
/data/https-portal/ssl_certs
/data/fess/home/fess
#/data/fess/opt/fess
/data/fess/var/lib/fess
/data/fess/var/log/fess
/data/fess/usr/share/fess/app/WEB-INF/plugin
/data/fess/usr/share/fess/app/WEB-INF/view/codesearch
/data/fess/usr/share/fess/app/css/codesearch
/data/fess/usr/share/fess/app/images/codesearch
/data/elasticsearch/usr/share/elasticsearch/data
/data/elasticsearch/usr/share/elasticsearch/config/dictionary
# ===== OS / editor =====
.DS_Store
*.swp
*~

# ===== Local scratch / backups (may contain credentials) =====
/tmp/
*.bulk

# ===== Local fess_config overrides (secrets: cipher key, password, ...) =====
/conf/fess_config.local.properties

# ===== Fess: runtime & generated (bind-mounted) =====
/data/fess/home/fess/
/data/fess/var/
/data/fess/usr/share/fess/app/WEB-INF/plugin/
/data/fess/themes/

# /opt/fess: ignore generated/live config, keep the tracked template
/data/fess/opt/fess/*
!/data/fess/opt/fess/system.properties.template

# Legacy JSP theme overrides & static assets
# (not mounted by compose; superseded by the fess-themes static theme)
/data/fess/usr/share/fess/app/WEB-INF/view/
/data/fess/usr/share/fess/app/css/
/data/fess/usr/share/fess/app/images/

# ===== OpenSearch: runtime data & dictionary (bind-mounted) =====
/data/opensearch/

# ===== https-portal: Let's Encrypt / TLS state (keep conf/*.erb) =====
/data/https-portal/ssl_certs/
133 changes: 123 additions & 10 deletions README.md
@@ -6,6 +6,16 @@

* [codesearch.codelibs.org](https://codesearch.codelibs.org/)

## Architecture / Theme Model

- **Theme**: Fess 15.7 static theme system — `theme.default=codesearch` in `system.properties` selects the codesearch theme. No virtual-host routing is needed for theme activation.
- **Fess config (`fess_config.properties`)**: `setup.sh` generates `data/fess/opt/fess/fess_config.properties` from the upstream base for the pinned Fess version plus the codesearch overlay (`conf/fess_config.overlay.properties`) and an optional local override (`conf/fess_config.local.properties`). It is mounted at `/opt/fess`, which the image places ahead of its `/etc/fess` default on the classpath, so the generated file takes effect. Only the delta is tracked in git; the base auto-tracks the pinned version. See [Configuration](#configuration).
- **Version pins (`.env`)**: `FESS_VERSION` / `OPENSEARCH_VERSION` are the single source of truth for the image tags (`compose.yaml`) and the `fess_config.properties` base.
- **system.properties**: The live file (`data/fess/opt/fess/system.properties`) is generated from `data/fess/opt/fess/system.properties.template` by `setup.sh` on first run. The live file is git-ignored.
- **Theme files**: The codesearch static theme is fetched from the [fess-themes](https://github.com/codelibs/fess-themes) repository by `setup.sh` and stored in `data/fess/themes/codesearch/`. This directory is mounted into the container at `/usr/share/fess/app/themes/codesearch`.
- **index.filetype**: Source-code aware mimetype→label map, maintained in `conf/fess_config.overlay.properties` (a multi-line value, so it lives in the file rather than a `-D` flag).
- **Management CLI (`fessctl`)**: Repositories are registered and crawls are triggered with [`fessctl`](https://github.com/codelibs/fessctl), the official Fess admin-API CLI (see [Install fessctl](#install-fessctl)).
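
To illustrate the version-pins bullet above, the pins can be read back from `.env` with a few lines of shell. This is a minimal sketch and not part of this repository: `read_pin` is a hypothetical helper, and the demo file only mirrors the two pins that `.env` defines.

```bash
# Hypothetical helper (not in this repository): print the value of KEY from a
# dotenv-style FILE. If KEY appears more than once, the last line wins.
read_pin() {
  awk -v key="$2" '
    /^[ \t]*#/ { next }                      # ignore comment lines
    index($0, key "=") == 1 {                # line starts with KEY=
      value = substr($0, length(key) + 2)    # everything after the first "="
    }
    END { if (value != "") print value }
  ' "$1"
}

# Demo against a throwaway copy of the pins.
demo_env=$(mktemp)
printf '%s\n' '# Version pins' 'FESS_VERSION=15.7.0' 'OPENSEARCH_VERSION=3.7.0' > "$demo_env"
echo "Fess $(read_pin "$demo_env" FESS_VERSION), OpenSearch $(read_pin "$demo_env" OPENSEARCH_VERSION)"
# prints: Fess 15.7.0, OpenSearch 3.7.0
```

A helper like this could let a script reuse the same pins that Docker Compose substitutes into `compose.yaml`, instead of repeating the version numbers.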

## Getting Started

### Setup
@@ -18,6 +28,13 @@ $ cd docker-codesearch
$ bash ./bin/setup.sh
```

`setup.sh` will:
1. Create required data directories
2. Download the Fess data store plugin (fess-ds-git)
3. Fetch the codesearch static theme from fess-themes (if not already present)
4. Generate `data/fess/opt/fess/system.properties` from the template (if not already present)
5. Generate `data/fess/opt/fess/fess_config.properties` from the pinned base + codesearch overlay

### Start the Server

To start the server, use Docker Compose:
@@ -28,28 +45,77 @@ docker compose -f compose.yaml up -d

Once the server is running, access it at [http://localhost:8080/](http://localhost:8080/).

The first start initializes the search indices in OpenSearch (this can take a minute or two). The site has no documents until you register a repository and run a crawler (see below).

### Create an Access Token

To use the Admin API for Fess, create an access token with the `{role}admin-api` permission on the Admin Access Token page ([http://localhost:8080/admin/accesstoken/](http://localhost:8080/admin/accesstoken/)).
`fessctl` (used in the next steps) authenticates to Fess with an access token. Create one with the `{role}admin-api` permission on the Admin Access Token page ([http://localhost:8080/admin/accesstoken/](http://localhost:8080/admin/accesstoken/)).

For more details, see the [Admin Access Token Guide](https://fess.codelibs.org/14.14/admin/accesstoken-guide.html).
For more details, see the [Admin Access Token Guide](https://fess.codelibs.org/15.7/admin/accesstoken-guide.html).

### Create DataStore Configuration for GitHub
### Install fessctl

You can create DataStore and Scheduler settings on Fess using the `bin/register_github.sh` script:
Repositories are registered and crawls are triggered with [`fessctl`](https://github.com/codelibs/fessctl), the official CLI for the Fess Admin API:

```bash
register_github.sh ACCESS_TOKEN FESS_URL REPO_DOMAIN REPO_ORG REPO_NAME
pipx install fessctl # or: uv tool install fessctl
```

`fessctl` requires Python 3.13+ (`pipx` / `uv` provide it automatically). Point it at the server and the access token created above:

Example:
$ bash ./bin/register_github.sh ...token... http://localhost:8080 github.com codelibs fess
```bash
export FESS_ENDPOINT=http://localhost:8080
export FESS_ACCESS_TOKEN=<your-access-token>
export FESS_VERSION=15.7.0
fessctl ping # reports the search engine status (GREEN when ready)
```

Check the created settings on the DataConfig page ([http://localhost:8080/admin/dataconfig/](http://localhost:8080/admin/dataconfig/)).
> `fessctl` can also be run from its container image (`ghcr.io/codelibs/fessctl`); see the [fessctl README](https://github.com/codelibs/fessctl) for details.

### Register a Repository

Create a Git data store config for each repository you want to index. The `handler-script` maps Git metadata to the codesearch fields (`organization`, `repository`, `filetype`, …) that power the search facets. Replace `codelibs` / `fess-suggest` / `master` with your own organization, repository, and default branch:

### Start the Crawler
```bash
fessctl dataconfig create \
--name "github.com/codelibs/fess-suggest" \
--handler-name GitDataStore \
--handler-parameter 'uri=https://github.com/codelibs/fess-suggest.git
base_url=https://github.com/codelibs/fess-suggest/blob/master/
extractors=text/.*:textExtractor,application/xml:textExtractor,application/javascript:textExtractor,application/json:textExtractor,application/x-sh:textExtractor,application/x-bat:textExtractor,audio/.*:filenameExtractor,chemical/.*:filenameExtractor,image/.*:filenameExtractor,model/.*:filenameExtractor,video/.*:filenameExtractor,
delete_old_docs=false
repository_path=/home/fess/workspace/fess-suggest' \
--handler-script 'url=url
host="github.com"
site="github.com/codelibs/fess-suggest/" + path
title=name
content=container.getComponent("documentHelper").appendLineNumber("L", content)
digest=author.toExternalString()
content_length=contentLength
last_modified=timestamp
timestamp=timestamp
filename=name
mimetype=mimetype
domain="github.com"
organization="codelibs"
repository="fess-suggest"
path=path
repository_url="https://github.com/codelibs/fess-suggest"
filetype=container.getComponent("fileTypeHelper").get(mimetype)' \
--permission "{role}guest"
```

To start the crawler, run `Default Crawler` or `Data Crawler - ...` on the Admin Scheduler page ([http://localhost:8080/admin/scheduler/](http://localhost:8080/admin/scheduler/)).
Review the registered repositories on the [DataConfig page](http://localhost:8080/admin/dataconfig/).
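
When several repositories are registered, the `--handler-parameter` value above can be generated rather than edited by hand. The following is a minimal sketch and not part of this repository: `git_handler_parameter` is a hypothetical helper, it assumes a GitHub-hosted repository, and it only reproduces the parameter lines shown above with the organization, repository, and branch substituted.

```bash
# Hypothetical helper (not in this repository): print the GitDataStore
# handler parameters for one GitHub repository.
# usage: git_handler_parameter ORG REPO BRANCH
git_handler_parameter() {
  org=$1
  repo=$2
  branch=$3
  # The extractors line is copied unchanged from the example above.
  cat <<EOF
uri=https://github.com/${org}/${repo}.git
base_url=https://github.com/${org}/${repo}/blob/${branch}/
extractors=text/.*:textExtractor,application/xml:textExtractor,application/javascript:textExtractor,application/json:textExtractor,application/x-sh:textExtractor,application/x-bat:textExtractor,audio/.*:filenameExtractor,chemical/.*:filenameExtractor,image/.*:filenameExtractor,model/.*:filenameExtractor,video/.*:filenameExtractor,
delete_old_docs=false
repository_path=/home/fess/workspace/${repo}
EOF
}

# Demo: print the parameters for the repository used in the example above.
git_handler_parameter codelibs fess-suggest master
```

The output could be passed as `--handler-parameter "$(git_handler_parameter codelibs fess-suggest master)"` in the `fessctl dataconfig create` command above. The `--handler-script` value needs the same three substitutions.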

### Run the Crawler

Trigger the built-in **Default Crawler**, which crawls every registered data store config:

```bash
fessctl scheduler start default_crawler
```

It also runs daily on its own schedule, so newly registered repositories are picked up automatically. Follow progress on the [Scheduler page](http://localhost:8080/admin/scheduler/) (Job Log), or watch results appear on the search page.

### Search

@@ -63,3 +129,50 @@ To stop the server, use the following command:
docker compose -f compose.yaml down
```

## Configuration

### Fess settings (fess_config.properties)

Codesearch-specific `fess_config.properties` settings are maintained as a small delta in `conf/fess_config.overlay.properties`. `setup.sh` (via `bin/render-fess-config.sh`) fetches the upstream base for the pinned `FESS_VERSION` and overlays this delta to generate `data/fess/opt/fess/fess_config.properties`. After editing the overlay, re-run `setup.sh` (or `bash ./bin/render-fess-config.sh`).

**Secrets / per-deployment values** (e.g. the cipher key, the initial admin password) must **not** go in the tracked overlay. Create `conf/fess_config.local.properties` (git-ignored) — its keys are applied last and win:

```properties
app.cipher.key=your-secret-key-here
index.user.initial_password=your-admin-password
```

> The cipher key encrypts stored credentials; set it **before first boot**, because changing it later invalidates already-encrypted data.
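
The layering described above (upstream base, then the tracked overlay, then the local override, with later files winning) can be sketched as follows. This is an illustration only and not the contents of `bin/render-fess-config.sh`: `merge_properties` is a hypothetical helper, the sample keys other than `app.cipher.key` are made up, and the sketch assumes single-line `key=value` entries, so it does not handle multi-line values such as `index.filetype`.

```bash
# Hypothetical helper (not in this repository): merge properties files so that
# keys in later files override earlier ones. Keys keep first-seen order.
# Assumes single-line key=value entries; continuation lines are not handled.
merge_properties() {
  awk '
    /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
    {
      eq = index($0, "=")
      if (eq == 0) next                      # not a key=value line
      key = substr($0, 1, eq - 1)
      if (!(key in value)) order[++n] = key  # remember first appearance
      value[key] = substr($0, eq + 1)        # the last file read wins
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" value[order[i]] }
  ' "$@"
}

# Demo with throwaway files; example.* keys are placeholders.
demo_dir=$(mktemp -d)
printf '%s\n' 'example.a=base' 'example.b=base' > "$demo_dir/base.properties"
printf '%s\n' '# overlay' 'example.b=overlay' > "$demo_dir/overlay.properties"
printf '%s\n' 'app.cipher.key=local-secret' > "$demo_dir/local.properties"
merge_properties "$demo_dir/base.properties" "$demo_dir/overlay.properties" "$demo_dir/local.properties"
```

In the demo, `example.b` takes the overlay value and `app.cipher.key` comes only from the local file, which mirrors the "applied last and win" rule for `conf/fess_config.local.properties`.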

### system.properties

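To modify system-level Fess settings, edit `data/fess/opt/fess/system.properties.template` and re-run `setup.sh`, or edit the live `data/fess/opt/fess/system.properties` directly. Because `setup.sh` generates the live file only if it is not already present, a template change reaches an existing deployment only after the live file is removed. The live file is git-ignored.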
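
The "generate only if not already present" behaviour that the Setup steps describe for `system.properties` can be sketched as below. This is a minimal illustration and not the actual `setup.sh`: `render_if_missing` is a hypothetical helper, and it assumes generation is a plain copy of the template, which the real script may not do.

```bash
# Hypothetical helper (not in this repository): create LIVE from TEMPLATE only
# when LIVE does not exist yet, so local edits to LIVE are never overwritten.
# usage: render_if_missing TEMPLATE LIVE
render_if_missing() {
  template=$1
  live=$2
  if [ -e "$live" ]; then
    echo "kept existing $live"
  else
    cp "$template" "$live" && echo "generated $live"
  fi
}

# Demo with throwaway files.
demo_dir=$(mktemp -d)
printf '%s\n' 'theme.default=codesearch' > "$demo_dir/system.properties.template"
render_if_missing "$demo_dir/system.properties.template" "$demo_dir/system.properties"  # first run: generated
printf '%s\n' 'local.edit=true' >> "$demo_dir/system.properties"
render_if_missing "$demo_dir/system.properties.template" "$demo_dir/system.properties"  # second run: kept
```

Under this assumption, a later template change does not reach an existing live file unless that file is removed first.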

## Optional: AI Chat (RAG)

To enable AI-powered chat on search results, add the following to `conf/fess_config.local.properties` (or the overlay) and install an LLM plugin, then re-run `setup.sh`:

```properties
rag.chat.enabled=true
```

AI chat is disabled by default. See [Fess LLM plugins](https://github.com/codelibs?q=fess-llm) for available LLM integrations.

## Updating

To update to the latest code, use plain `git pull`:

```bash
git pull
```

Live/generated files (`system.properties`, `fess_config.properties`, theme assets) are git-ignored and will not be overwritten by `git pull`.

To upgrade the Fess / OpenSearch version, edit the pins in `.env` (`FESS_VERSION`, `OPENSEARCH_VERSION`) and re-run `setup.sh`. The `fess_config.properties` base is re-fetched for the new version and the codesearch overlay is re-applied automatically:

```bash
bash ./bin/setup.sh
docker compose -f compose.yaml up -d
```

> **Re-index after a major version bump**: a Fess or OpenSearch major upgrade can change the index format. If search returns errors or stops returning results after upgrading, re-crawl your repositories with `fessctl scheduler start default_crawler` to rebuild the index.
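
The pin edit described above can also be scripted. This is a minimal sketch and not part of this repository: `set_pin` is a hypothetical helper, and `0.0.0-example` is a placeholder, not a real Fess release.

```bash
# Hypothetical helper (not in this repository): set KEY=VALUE in a dotenv-style
# FILE, replacing an existing KEY line or appending one if KEY is absent.
# usage: set_pin FILE KEY VALUE
set_pin() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp) || return 1
  awk -v key="$key" -v value="$value" '
    index($0, key "=") == 1 { print key "=" value; found = 1; next }
    { print }                                # keep comments and other pins
    END { if (!found) print key "=" value }
  ' "$file" > "$tmp" && cat "$tmp" > "$file" # rewrite in place, keep file mode
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Demo against a throwaway copy of the pins.
demo_env=$(mktemp)
printf '%s\n' '# Version pins' 'FESS_VERSION=15.7.0' 'OPENSEARCH_VERSION=3.7.0' > "$demo_env"
set_pin "$demo_env" FESS_VERSION 0.0.0-example
cat "$demo_env"
```

After changing the real `.env` this way, the `setup.sh` and `docker compose` commands above still need to be run for the new pins to take effect.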
25 changes: 0 additions & 25 deletions bin/git_pull.sh

This file was deleted.
