Choosing Hardware
Server Requirements for Tiledesk
Minimum Infrastructure
The minimum recommended setup for hosting Tiledesk involves two separate servers:
Server 1: To host the Tiledesk Server, Dashboard, Widget, and Web Chat.
Server 2: Dedicated to MongoDB.
This basic infrastructure is enough to get Tiledesk up and running, but it does not address advanced quality requirements such as performance, high availability, backups, redundancy, and disaster recovery.
Instance Type Recommendation: An AWS EC2 t2.small instance (or equivalent) is sufficient for testing environments. For production environments, a c4.large instance is recommended, providing better performance and stability (see the provisioning sketch below).
Alternative with Heroku: If you prefer using Heroku, we recommend at least the Hobby dyno type for small applications, although Tiledesk can also run on the Free type for development and testing.
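If you provision the application server on AWS programmatically, a minimal boto3 sketch like the following can launch the recommended instance type. The AMI ID, key pair, and security group are placeholders to replace with your own values; they are not values shipped by Tiledesk.

```python
# Minimal sketch: launch an EC2 instance for the Tiledesk application server.
# Assumes boto3 is configured with valid AWS credentials; the AMI ID, key pair
# and security group below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",      # your Ubuntu/Amazon Linux AMI
    InstanceType="t2.small",              # use "c4.large" for production
    KeyName="tiledesk-key",               # placeholder key pair name
    SecurityGroupIds=["sg-xxxxxxxx"],     # placeholder security group
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "tiledesk-server"}],
    }],
)
print(response["Instances"][0]["InstanceId"])
```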
Production Environment Configuration
For production environments with higher traffic and reliability requirements, consider a more robust server configuration:
Load Balancer: To manage traffic and increase availability by distributing the load across multiple instances.
Auto-scaling: To dynamically adjust server capacity based on traffic.
Backup and Disaster Recovery: Enable regular automated backups and define a disaster recovery plan.
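As a minimal sketch of the auto-scaling piece, the boto3 call below creates an Auto Scaling group attached to a load balancer target group. The launch template name, target group ARN, and subnet IDs are placeholders; size the group to your expected traffic.

```python
# Minimal sketch: create an Auto Scaling group for the Tiledesk application tier.
# The launch template, target group ARN and subnet IDs are placeholders.
import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="tiledesk-asg",
    LaunchTemplate={"LaunchTemplateName": "tiledesk-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",   # placeholder subnets
    TargetGroupARNs=["arn:aws:elasticloadbalancing:eu-west-1:000000000000:targetgroup/tiledesk/abc123"],
    HealthCheckType="ELB",                                 # fail instances the LB marks unhealthy
    HealthCheckGracePeriod=120,
)
```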
Front-end Components
To serve Tiledesk’s front-end components (Dashboard, Widget, and Web Chat), we suggest using AWS S3 + CloudFront. This setup enables efficient and scalable distribution of static content.
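A minimal sketch of publishing a built front-end (for example, the Dashboard) to an S3 bucket served by CloudFront is shown below. The bucket name and build directory are placeholder assumptions; adjust them to your own deployment.

```python
# Minimal sketch: upload a built front-end to an S3 bucket behind CloudFront.
# Bucket name and build directory are placeholders.
import mimetypes
import os
import boto3

s3 = boto3.client("s3")
BUCKET = "tiledesk-dashboard-static"     # placeholder bucket name
BUILD_DIR = "dist"                       # placeholder build output directory

for root, _dirs, files in os.walk(BUILD_DIR):
    for name in files:
        path = os.path.join(root, name)
        key = os.path.relpath(path, BUILD_DIR).replace(os.sep, "/")
        content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
        s3.upload_file(
            path, BUCKET, key,
            ExtraArgs={"ContentType": content_type, "CacheControl": "max-age=86400"},
        )
        print(f"uploaded {key} ({content_type})")
```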
Traffic Optimization
Enable Gzip Compression on CloudFront to reduce network traffic and improve loading performance. More info here: https://aws.amazon.com/blogs/aws/new-gzip-compression-support-for-amazon-cloudfront/
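If you manage the distribution programmatically, compression can be switched on with a short boto3 sketch like the one below. The distribution ID is a placeholder; CloudFront requires the current ETag to be passed back as IfMatch when updating the configuration.

```python
# Minimal sketch: enable compression on an existing CloudFront distribution.
# The distribution ID is a placeholder.
import boto3

cloudfront = boto3.client("cloudfront")
DISTRIBUTION_ID = "EXXXXXXXXXXXXX"   # placeholder distribution ID

current = cloudfront.get_distribution_config(Id=DISTRIBUTION_ID)
config = current["DistributionConfig"]
config["DefaultCacheBehavior"]["Compress"] = True   # serve compressed objects to viewers

cloudfront.update_distribution(
    Id=DISTRIBUTION_ID,
    IfMatch=current["ETag"],        # ETag from the get call, required for updates
    DistributionConfig=config,
)
```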
Database
We recommend the following configurations for MongoDB:
MongoDB Options
MongoDB Atlas:
M10: Suitable for testing or low-traffic applications. It supports continuous backups for data security.
M30 or higher: Recommended for high-traffic applications or production environments, providing better performance and scalability.
Local Installation in Replica Set Mode:
Replica Set with at least three nodes, each located in a separate Availability Zone to improve database reliability and availability. This setup reduces the risk of data loss and ensures greater service continuity.
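Whichever option you choose, it is worth verifying connectivity from the application host. The sketch below uses pymongo with placeholder connection strings for an Atlas cluster and for a self-hosted three-node replica set; neither reflects Tiledesk's actual configuration values.

```python
# Minimal sketch: verify connectivity to MongoDB from the application host.
# Both connection strings below are placeholders.
from pymongo import MongoClient

# MongoDB Atlas (M10/M30): SRV connection string from the Atlas UI
atlas_uri = "mongodb+srv://user:password@cluster0.example.mongodb.net/tiledesk?retryWrites=true&w=majority"

# Self-hosted replica set spread across three Availability Zones
replica_uri = (
    "mongodb://mongo-az1:27017,mongo-az2:27017,mongo-az3:27017/tiledesk"
    "?replicaSet=rs0"
)

client = MongoClient(atlas_uri)          # or MongoClient(replica_uri)
print(client.admin.command("ping"))      # {'ok': 1.0} when the cluster is reachable
```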
Additional Database Management Tips
Sharding (for high workloads): In cases of heavy database usage, consider implementing sharding to distribute the load and enhance performance.
Performance Monitoring: Use tools like MongoDB Compass or integrations with monitoring systems (e.g., CloudWatch) to track the database status and optimize resource allocation.
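As a rough illustration of what enabling sharding involves, the sketch below runs the standard MongoDB admin commands through pymongo against a mongos router. The database name, collection, and shard key are illustrative placeholders, not Tiledesk's actual schema.

```python
# Minimal sketch: enable sharding for a database and shard one collection.
# Must run against a mongos router; names and shard key are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")   # placeholder mongos address

client.admin.command("enableSharding", "tiledesk")
client.admin.command(
    "shardCollection",
    "tiledesk.messages",                 # placeholder collection
    key={"request_id": "hashed"},        # placeholder hashed shard key
)

# Basic health data, also usable as input for CloudWatch custom metrics
status = client.admin.command("serverStatus")
print(status["connections"], status["mem"])
```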
Optional: Hosting Large Language Models (LLMs)
If you plan to host a Large Language Model (LLM) such as LLaMA 3, which requires substantial computational resources, here are additional hardware specifications and recommendations:
Hardware Specifications for LLM Hosting
GPU
Architecture: NVIDIA Ampere or newer (e.g., A100, A6000, H100).
GPU Memory: At least 40 GB of HBM2 or higher (for larger models).
CUDA Cores: At least 6,912 cores to handle intensive AI workloads.
NVLink: If available, use NVLink to connect multiple GPUs, increasing communication bandwidth between GPUs and improving performance for larger models.
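Before deploying a model, you can confirm the host exposes a suitable GPU with a short check like the one below. It assumes PyTorch with CUDA support is installed; compute capability 8.0 or higher corresponds to the Ampere generation or newer.

```python
# Minimal sketch: check GPU model, memory and compute capability with PyTorch.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA-capable GPU detected")

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    mem_gb = props.total_memory / 1024**3
    # Compute capability 8.0+ == Ampere or newer (e.g. A100, A6000, H100)
    print(f"GPU {i}: {props.name}, {mem_gb:.0f} GB, compute capability {props.major}.{props.minor}")
```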
CPU
Multi-core CPU: Minimum 16 cores (32 threads) to support distributed workloads. Preferably a recent AMD EPYC or Intel Xeon CPU.
Clock Speed: At least 2.5 GHz per core.
RAM
RAM Amount: Minimum 128 GB DDR4/DDR5. For particularly large models, consider 256 GB or more, especially if the model will handle multiple requests simultaneously.
Storage
Storage Type: NVMe SSD for faster data access and improved I/O performance.
Storage Capacity: Minimum 1 TB, with scalability based on the model size and any auxiliary data. For long-term projects, consider a distributed storage system.
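A quick sanity check of a candidate host against these figures can be done with a small script such as the one below; it assumes the psutil package is installed, and the thresholds simply mirror the minimums listed above.

```python
# Minimal sketch: compare CPU cores, RAM and disk space against the minimums above.
import shutil
import psutil

cores = psutil.cpu_count(logical=False)
threads = psutil.cpu_count(logical=True)
ram_gb = psutil.virtual_memory().total / 1024**3
disk_tb = shutil.disk_usage("/").total / 1024**4

print(f"CPU:  {cores} cores / {threads} threads")
print(f"RAM:  {ram_gb:.0f} GB")
print(f"Disk: {disk_tb:.2f} TB")

assert cores >= 16 and ram_gb >= 128 and disk_tb >= 1, "below recommended minimums"
```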
Additional Optimization Suggestions
Cooling and Power: High-performance GPUs like the A100 or H100 require adequate cooling and stable power supply. Ensure that the cooling infrastructure is sufficient, especially for on-premises setups.
Cluster Configurations: For large-scale applications, consider a GPU cluster, for example a Kubernetes cluster with GPU support. This allows elastic scaling based on workload and provides redundancy.
Networking: For clusters, consider using a high-speed network (e.g., InfiniBand) to minimize latency between nodes and enhance overall performance.
Containers and Virtualization: Use Docker or other containers for rapid deployment and to ensure portability of the model across different platforms.
Resource Management and Monitoring: Use tools like NVIDIA’s GPU Cloud (NGC) to monitor and manage resources, optimizing GPU utilization and maintaining system stability.
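As a lightweight alternative to a full monitoring stack, GPU utilization and memory can be polled with nvidia-smi, as in the sketch below; the polling interval is an arbitrary example value.

```python
# Minimal sketch: poll GPU utilization and memory with nvidia-smi.
import subprocess
import time

QUERY = ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"]

while True:
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    for line in result.stdout.strip().splitlines():
        idx, util, used, total = [v.strip() for v in line.split(",")]
        print(f"GPU {idx}: {util}% utilization, {used}/{total} MiB memory")
    time.sleep(30)   # example polling interval
```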
Cloud Hosting Considerations
Tiledesk can be deployed in private or public cloud environments, using dedicated hardware configurations from cloud providers such as AWS, Azure, or Google Cloud, with support for high-end GPUs like the NVIDIA A100 and V100. This approach offers scalability and simplified management without the need to maintain physical hardware.
This guide outlines a solid, scalable setup for running Tiledesk in both development and production environments, with optional infrastructure suggestions for hosting Large Language Models, so your infrastructure can grow with traffic demands.