Why are these changes being introduced:
* We are planning a migration to AOSS
* We need to maintain our existing AWS OpenSearch Service (ES)
integration while migrating to AOSS
Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/USE-423
How does this address that need:
* Added support for AWS OpenSearch Serverless (AOSS), authenticating either
with expiring credentials (by passing a session token) or by assuming a role.
* Configured the application to support AWS OpenSearch Serverless (AOSS)
in addition to the existing AWS OpenSearch Service (ES).
* Added logic to choose the appropriate client based on environment variables.
* Implemented AWS SigV4 signing for AOSS authentication.
Document any side effects to this change:
* Updated the lambda configuration to support session tokens. The lambda
does not have assume-role configuration at this time, but using temporary
credentials locally with OpenSearch Serverless (AOSS) requires the lambda
to accept them as well, so I included that in this change.
* Reorganized documentation on environment variables
* Allowed changing the log level in development
* Changed a log level for a metric we aren't using yet
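For reviewers, the client-selection rule described above could be sketched roughly as follows. This is not the code from this change: the method name `opensearch_mode`, the returned symbols, the exact-match check on `'true'`, and the choice to let `AWS_AOSS` take precedence over `AWS_OPENSEARCH` are all assumptions made for illustration. Only the environment variable names come from the documentation in this commit.

```ruby
# Hypothetical helper, not the application's implementation.
# Decides which OpenSearch connection strategy applies for a given environment.
def opensearch_mode(env = ENV)
  if env['AWS_AOSS'] == 'true'
    # A session token means temporary credentials are used directly,
    # so no role ARN is needed in that case.
    return :aoss_session_token unless env['AWS_SESSION_TOKEN'].to_s.empty?
    # Otherwise a role ARN is required so credentials can be refreshed.
    return :aoss_assume_role unless env['AWS_AOSS_ROLE_ARN'].to_s.empty?

    raise ArgumentError, 'AWS_AOSS=true requires AWS_SESSION_TOKEN or AWS_AOSS_ROLE_ARN'
  elsif env['AWS_OPENSEARCH'] == 'true'
    # Legacy AWS OpenSearch Service with SigV4 signing.
    :aws_opensearch_service
  else
    # No AWS signing, e.g. a local OpenSearch at OPENSEARCH_URL.
    :unsigned
  end
end
```

Passing a plain `Hash` instead of `ENV` keeps the rule easy to exercise in a unit test without touching the process environment.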
+- [AWS Credentials (Used for AWS-based OpenSearch and timdex-semantic-builder)](#aws-credentials-used-for-aws-based-opensearch-and-timdex-semantic-builder)
+- [AWS OpenSearch Service (Legacy)](#aws-opensearch-service-legacy)
-Once the jekyll server is running, you can access the local docs at http://localhost:4000/timdex/
+Once the jekyll server is running, you can access the local docs at <http://localhost:4000/timdex/>

 Note: it is important to load the documentation from the `/timdex/` path locally as that is how it works when built and deployed to GitHub Pages so testing locally the same way will ensure our asset paths will work when deployed.

@@ -146,51 +166,70 @@ The config file `./docs/reference/_spectaql_config.yml` controls the build proce
 and making changes to this file (which is included in version control) would be the main reason to run the process
 locally.

-## Required Environment Variables (all ENVs)
-
-- `EMAIL_FROM`: email address to send message from, including the registration
-  and forgot password messages.
-- `EMAIL_URL_HOST` - base url to use when sending emails that link back to the
-  application. In development, often `localhost:3000`. On heroku, often
-  `yourapp.herokuapp.com`. However, if you use a custom domain in production,
-  that should be the value you use in production.
-- `JWT_SECRET_KEY`: generate with `rails secret`
+## General Configuration

-##Production required Environment Variables
+### Name and Domain

-- `AWS_ACCESS_KEY_ID`: AWS credentials for OpenSearch and Lambda
-- `AWS_SECRET_ACCESS_KEY`: AWS credentials for OpenSearch and Lambda
-- `AWS_REGION`: AWS region for OpenSearch and Lambda services
-- `AWS_OPENSEARCH`: boolean. Set to true to enable AWSv4 Signing for OpenSearch
-- `OPENSEARCH_INDEX`: Opensearch index or alias to query, default will be to search all indexes which is generally not
-  expected. `timdex` or `all-current` are aliases used consistently in our data pipelines, with
-  `timdex` being most likely what most use cases will want.
-- `OPENSEARCH_URL`: Opensearch URL, defaults to `http://localhost:9200`
-- `TIMDEX_SEMANTIC_BUILDER_FUNCTION_NAME`: AWS Lambda function name with alias for semantic query building.
-  Configurable to use alternative deployment tiers (e.g., dev1, stage, prod).
-- `SMTP_ADDRESS`
-- `SMTP_PASSWORD`
-- `SMTP_PORT`
-- `SMTP_USER`
-
-## Optional Environment Variables (all ENVs)
-
-- `AWS_SESSION_TOKEN`: AWS session token for temporary credentials when using expiring AWS credentials
-- `OPENSEARCH_LOG` if `true`, verbosely logs OpenSearch queries.
-
-  ```text
-  NOTE: do not set this ENV at all if you want ES logging fully disabled.
-  Setting it to `false` is still setting it and you will be annoyed and
-  confused.
-  ```
-
-- `OPENSEARCH_SOURCE_EXCLUDES` comma separated list of fields to exclude from the OpenSearch `_source` field. Leave unset to return all fields.
 - `PLATFORM_NAME`: The value set is added to the header after the MIT Libraries logo. The logic and CSS for this comes from our theme gem.
-- `PREFERRED_DOMAIN` - set this to the domain you would like to to use. Any
-  other requests that come to the app will redirect to the root of this domain.
-  This is useful to prevent access to herokuapp.com domains.
-- `REQUESTS_PER_PERIOD` - requests allowed before throttling. Default is 100.
-- `REQUEST_PERIOD` - number of minutes for the period in `REQUESTS_PER_PERIOD`.
-  Default is 1.
+- `PREFERRED_DOMAIN`: set this to the domain you would like to use. Any other requests that come to the app will redirect to the root of this domain. This is useful to prevent access to herokuapp.com domains.
+
+### Authentication
+
+- `JWT_SECRET_KEY`: generate with `rails secret` **required**
+
+### Email Configuration
+
+- `EMAIL_FROM`: email address to send message from, including the registration and forgot password messages. **required**
+- `EMAIL_URL_HOST`: base url to use when sending emails that link back to the application. In development, often `localhost:3000`. On heroku, often `yourapp.herokuapp.com`. However, if you use a custom domain in production, that should be the value you use in production. **required**
+- `SMTP_ADDRESS`: SMTP server address (Required for production)
+- `SMTP_PORT`: SMTP server port (Required for production)
+- `SMTP_USER`: SMTP authentication user (Required for production)
+- `SMTP_PASSWORD`: SMTP authentication password (Required for production)
+
+### Observability (Optional)
+
+- `RAILS_LOG_LEVEL`: defaults to debug in development and info in production
 - `SENTRY_DSN`: client key for Sentry exception logging
 - `SENTRY_ENV`: Sentry environment for the application. Defaults to 'unknown' if unset.
+
+### Rate Limiting (Optional)
+
+- `REQUESTS_PER_PERIOD`: requests allowed before throttling. Default is 100.
+- `REQUEST_PERIOD`: number of minutes for the period in `REQUESTS_PER_PERIOD`. Default is 1.
+
+## AWS Configuration
+
+### OpenSearch Configuration
+
+- `OPENSEARCH_URL`: OpenSearch endpoint URL, defaults to `http://localhost:9200`
+- `OPENSEARCH_INDEX`: OpenSearch index or alias to query. Defaults to searching all indexes (generally not recommended). `timdex` or `all-current` are aliases used consistently in our data pipelines, with `timdex` being most likely what most use cases will want. **required**
+- `OPENSEARCH_LOG`: if set to `true` (case-insensitive), verbosely logs OpenSearch queries. Leave unset, or set to any other value such as `false`, to keep OpenSearch logging disabled.
+- `OPENSEARCH_SOURCE_EXCLUDES`: comma-separated list of fields to exclude from the OpenSearch `_source` field. Leave unset to return all fields. Recommended value: `embedding_full_record,fulltext`
+
+### AWS Credentials (Used for AWS-based OpenSearch and timdex-semantic-builder)
+
+- `AWS_ACCESS_KEY_ID`: AWS access key for OpenSearch and Lambda
+- `AWS_SECRET_ACCESS_KEY`: AWS secret key for OpenSearch and Lambda
+- `AWS_REGION`: AWS region for OpenSearch and Lambda services
+- `AWS_SESSION_TOKEN`: (Optional) AWS session token for temporary credentials when using expiring AWS credentials.
+  Use this with temporary AWS credentials for AWS-based OpenSearch access and Lambda.
+  For AOSS, when this is set, temporary credentials are used directly and `AWS_AOSS_ROLE_ARN` is not needed.
+
+### AWS OpenSearch Service (Legacy)
+
+This is our legacy AWS OpenSearch Service Cluster. All production instances should use this until our migration to Serverless (AOSS) is complete.
+
+- `AWS_OPENSEARCH`: boolean. Set to `true` to enable AWS SigV4 signing for AWS OpenSearch Service. This is the legacy approach and will be replaced with `AWS_AOSS` when we complete our migration to Serverless.
+
+### AWS OpenSearch Serverless (AOSS)
+
+This is our upcoming configuration once migration is complete. This uses a different [authentication mechanism](https://github.com/awsdocs/amazon-opensearch-service-developer-guide/blob/master/doc_source/serverless-clients.md#ruby) than our legacy AWS OpenSearch Service.
+
+- `AWS_AOSS`: boolean. Set to `true` to enable AWS OpenSearch Serverless (AOSS).
+- `AWS_AOSS_ROLE_ARN`: AWS IAM role ARN to assume for AOSS authentication. **Required when** `AWS_AOSS=true` **and** `AWS_SESSION_TOKEN` is not set. This enables automatic credential refresh via role assumption.
+  When `AWS_SESSION_TOKEN` is present, temporary credentials are used directly and `AWS_AOSS_ROLE_ARN` is not needed. This is only used in local development. `AWS_AOSS_ROLE_ARN` is used in production.
+
+### TIMDEX Semantic Builder Lambda
+
+- `TIMDEX_SEMANTIC_BUILDER_FUNCTION_NAME`: AWS Lambda function name with alias for semantic query building.
+  Configurable to use alternative deployment tiers (e.g., dev1, stage, prod). Generally takes the format `function_name:live` where `live` is the alias. Failure to include the alias will result in extremely slow performance at best. Use the alias. Note: the lambda must be in the same AWS account as OpenSearch. If you want to test dev1 OpenSearch, you must also switch the lambda name to a dev1 variant.
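
As a reading aid for the documented values above, here is a minimal sketch of how three of them might be interpreted. The helper names (`opensearch_log_enabled?`, `source_excludes`, `lambda_alias?`) are hypothetical and are not taken from the application. The sketch assumes only what the README text states: `OPENSEARCH_LOG` is on only for a case-insensitive `true`, `OPENSEARCH_SOURCE_EXCLUDES` is a comma-separated list, and the lambda name should carry an alias in `function_name:alias` form.

```ruby
# Hypothetical helpers illustrating the documented behavior; names are assumptions.

# Logging is enabled only when the value is "true" in any letter case.
# Unset, "false", or any other value keeps logging disabled.
def opensearch_log_enabled?(env = ENV)
  env['OPENSEARCH_LOG'].to_s.casecmp?('true')
end

# Splits the comma-separated field list; unset yields an empty list,
# which corresponds to returning all fields.
def source_excludes(env = ENV)
  env['OPENSEARCH_SOURCE_EXCLUDES'].to_s.split(',').map(&:strip).reject(&:empty?)
end

# True when the function name has the `function_name:alias` shape.
def lambda_alias?(function_name)
  name, function_alias = function_name.to_s.split(':', 2)
  !name.to_s.empty? && !function_alias.to_s.empty?
end
```

A check like `lambda_alias?` could run at boot to warn when the alias is missing, since the README notes that omitting it leads to very slow performance.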