Distributed James Server — blobstore.properties

BlobStore

This file is optional. If omitted, the Cassandra blob store will be used.

BlobStore is the dedicated component for storing blobs, i.e. non-indexable content. James uses the BlobStore for storing blobs, which are usually mail contents, attachments, deleted mails, etc.

You can choose the underlying implementation of BlobStore to fit your James setup.

It can be the implementation on top of Cassandra, or an S3-compatible object storage service such as OpenStack Swift or AWS S3.

Consult blob.properties in Git to get some examples and hints.

Implementation choice

implementation:

  • cassandra: use the Cassandra-based BlobStore

  • objectstorage: use the Swift / AWS S3 based BlobStore

  • file: (experimental) use the file system directly. Useful for legacy architectures based on shared iSCSI SANs and/or distributed file systems with no object store available.

JAMES-3591: Cassandra is not made to store large binary content; its use will be suboptimal compared to the alternatives (namely S3-compatible BlobStores backed by, for instance, S3, MinIO or Ozone).

The generated startup warning log can be deactivated by setting the cassandra.blob.store.disable.startup.warning environment variable to false.
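
For example, selecting the S3-compatible object storage backend is a single line in blobstore.properties (shown with the other accepted values as a reminder):

# One of: cassandra, objectstorage, file
implementation=objectstorage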

deduplication.enable: Mandatory. Supported values: true and false.

If you choose to enable deduplication, mails with the same content will be stored only once.

Once this feature is enabled, there is no turning back: turning it off will lead to the deletion of all the mails sharing the same content once one of them is deleted.
If you are upgrading from James 3.5 or older, deduplication was already enabled.

Deduplication requires a garbage collector mechanism to effectively drop blobs. A first implementation based on bloom filters can be used and triggered using the WebAdmin REST API. See Running blob garbage collection.

In order to avoid concurrency issues upon garbage collection, we slice the blobs into generations; the two most recent generations are not garbage collected.

deduplication.gc.generation.duration: Allows controlling the duration of one generation. Longer implies better deduplication, but deleted blobs will live longer. Duration; defaults to 30 days; the default unit is days.

deduplication.gc.generation.family: Every time the duration is changed, this integer counter must be incremented to avoid conflicts. Defaults to 1.
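
Put together, a deduplication setup in blobstore.properties could look like this (a sketch; values are illustrative):

# Store mails with identical content only once
deduplication.enable=true
# One generation lasts 30 days (the default unit is days)
deduplication.gc.generation.duration=30
# Increment whenever the generation duration changes
deduplication.gc.generation.family=1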

Encryption choice

Data can optionally be encrypted with a symmetric key using AES before being stored in the BlobStore. As many users rely on third parties for object storage, a compromised third party will not escalate to a data disclosure. Of course, a performance price has to be paid, as encryption takes resources.

encryption.aes.enable: Optional boolean, defaults to false.

If AES encryption is enabled, then the following properties MUST be present:

  • encryption.aes.password: String

  • encryption.aes.salt: Hexadecimal string

The following properties CAN be supplied:

  • encryption.aes.private.key.algorithm: String, defaulting to PBKDF2WithHmacSHA512. Previously was PBKDF2WithHmacSHA1.

Once chosen, this choice cannot be reverted: all the data is either clear or encrypted. Mixed encryption is not supported.

Here is an example of how you can generate the above values (be mindful to customize the byte lengths in order to add enough entropy):

# Password generation
openssl rand -base64 64

# Salt generation
openssl rand -hex 16
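
The generated values can then be wired into blobstore.properties. The password and salt below are placeholders, not real secrets:

encryption.aes.enable=true
# Placeholders: substitute the outputs of the openssl commands above
encryption.aes.password=changeMeBase64EncodedPassword
encryption.aes.salt=73616c7473616c7473616c7473616c74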

AES blob store supports the following system properties that could be configured in jvm.properties:

# Threshold from which we should buffer the blob to a file upon encrypting
# Units supported: K, M, G; defaults to no unit
james.blob.aes.file.threshold.encrypt=100K

# Threshold from which we should buffer the blob to a file upon decrypting
# Units supported: K, M, G; defaults to no unit
james.blob.aes.file.threshold.decrypt=256K

# Maximum size of a blob. Larger blobs will be rejected.
# Units supported: K, M, G; defaults to no unit
james.blob.aes.blob.max.size=100M

Cassandra BlobStore Cache

A Cassandra cache can be enabled to reduce latency when reading small blobs frequently. A dedicated keyspace with a replication factor of one is then used. The cache eviction policy is TTL based. Only blobs below a given threshold will be cached. Note that blobs are stored within a single Cassandra row, hence a low threshold should be used.

Table 1. blobstore.properties cache related content

cache.enable

DEFAULT: false, optional, must be a boolean. Whether the cache should be enabled.

cache.cassandra.ttl

DEFAULT: 7 days, optional, must be a duration. Cache eviction policy is TTL based.

cache.sizeThresholdInBytes

DEFAULT: 8192, optional, must be a positive integer. Unit: bytes. Supported units: bytes, KiB, MiB, GiB, TiB. Maximum size of cached objects, expressed in bytes.
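
As an illustration, enabling the cache with its documented defaults spelled out could read (the duration format shown is illustrative):

cache.enable=true
# Evict cached blobs after 7 days
cache.cassandra.ttl=7days
# Only cache blobs up to 8192 bytes
cache.sizeThresholdInBytes=8192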

Object storage configuration

AWS S3 Configuration

Table 2. blobstore.properties S3 related properties

objectstorage.s3.endPoint

S3 service endpoint

objectstorage.s3.region

S3 region

objectstorage.s3.accessKeyId

S3 access key id

objectstorage.s3.secretKey

S3 access key secret

objectstorage.s3.http.concurrency

Allows setting the number of concurrent HTTP requests permitted by the Netty driver.

objectstorage.s3.truststore.path

optional: Verify the S3 server certificate against this trust store file.

objectstorage.s3.truststore.type

optional: Specify the type of the trust store, e.g. JKS, PKCS12

objectstorage.s3.truststore.secret

optional: Use this secret/password to access the trust store; default none

objectstorage.s3.truststore.algorithm

optional: Use this specific trust store algorithm; default SunX509

objectstorage.s3.trustall

optional: boolean. Defaults to false. Cannot be set to true with other truststore options. Whether James should validate S3 endpoint SSL certificates.

objectstorage.s3.read.timeout

optional: HTTP read timeout. Duration; the default unit is seconds. Leaving it empty relies on S3 driver defaults.

objectstorage.s3.write.timeout

optional: HTTP write timeout. Duration; the default unit is seconds. Leaving it empty relies on S3 driver defaults.

objectstorage.s3.connection.timeout

optional: HTTP connection timeout. Duration; the default unit is seconds. Leaving it empty relies on S3 driver defaults.

objectstorage.s3.in.read.limit

optional: Objects read in memory will be rejected if they exceed the size limit exposed here. Size, example: 100M. Supported units: K, M, G; defaults to B if no unit is specified. If unspecified, big objects won't be prevented from being loaded in memory. This setting complements protocol limits.

objectstorage.s3.upload.retry.maxAttempts

optional: Integer. Default is zero. This property specifies the maximum number of retry attempts allowed for failed upload operations.

objectstorage.s3.upload.retry.backoffDurationMillis

optional: Long (milliseconds). Default is 10 milliseconds. Only takes effect when the objectstorage.s3.upload.retry.maxAttempts property is declared. This property determines the duration (in milliseconds) to wait between retry attempts for failed upload operations. This delay is known as backoff; the jitter factor is 0.5.
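
Putting the essential S3 properties together, a configuration against a hypothetical endpoint could look like this (endpoint, region and credentials are placeholders):

implementation=objectstorage
objectstorage.s3.endPoint=http://s3.example.com:8000/
objectstorage.s3.region=eu-west-1
objectstorage.s3.accessKeyId=myAccessKey
objectstorage.s3.secretKey=mySecretKey
# Optional: cap concurrent HTTP requests issued by the Netty driver
objectstorage.s3.http.concurrency=100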

Buckets Configuration

Table 3. Bucket configuration

objectstorage.bucketPrefix

Bucket is a concept in James, similar to Containers in Swift or Buckets in AWS S3. bucketPrefix is the prefix of bucket names in the James BlobStore.

objectstorage.namespace

BlobStore default bucket name. Most blobs stored in the BlobStore are placed in the default bucket, except for special cases like storing blobs of deleted messages.
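
For instance (names are illustrative):

# Prefix prepended to every bucket name James creates
objectstorage.bucketPrefix=prod-james-
# Default bucket holding most blobs
objectstorage.namespace=james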

Blob Export

Blob Exporting is the mechanism that helps James export a blob from one user to another. It is commonly used to export deleted messages (consult configuring deleted messages vault, /server/config-vault). The deleted messages are transformed into a blob, and James exports that blob to the target user.

This configuration helps you choose the blob exporting mechanism that fits your James setup; it is only applicable to Guice products.

Consult blob.properties in Git to get some examples and hints.

Configuration for exporting blob content:

Table 4. blobstore.properties content
blob.export.implementation

localFile: Local File Exporting Mechanism (explained below). Default: localFile

linshare: LinShare Exporting Mechanism (explained below)

Local File Blob Export Configuration

For each request, this mechanism retrieves the content of a blob and saves it to a distinct local file, then sends an email containing the absolute path of that file to the target mail address.

Note: that absolute file path is the file location on the James server. Therefore, if there are two or more James servers connected, this mechanism should not be considered an option.

blob.export.localFile.directory: The directory URL where exported blob data is stored as files; the URL follows the James file system scheme. Default: file://var/blobExporting
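
A local file export setup could therefore be as simple as (using the documented default directory):

blob.export.implementation=localFile
blob.export.localFile.directory=file://var/blobExporting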

LinShare Blob Export Configuration

Instead of exporting blobs to the local file system, using LinShare lets you upload your blobs; the people you shared them with can then access those blobs by connecting to the LinShare server and downloading them.

This approach lets you share across the whole network, as long as recipients can access the LinShare server.

To get an example or detailed explanations, consult blob.properties.

blob.export.linshare.url: The URL to connect to LinShare

blob.export.linshare.token: The authentication token to connect to LinShare
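
A LinShare export setup could look like this (the URL and token are placeholders):

blob.export.implementation=linshare
blob.export.linshare.url=http://linshare.example.com:8080
blob.export.linshare.token=blob-export-token-placeholder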